Abstract 

Background: 

Recently there has been a lot of interest in identifying modules at the level of genetic and metabolic networks 
of organisms, as well as in identifying single genes and reactions that are essential for the organism. A goal of 
computational and systems biology is to go beyond identification towards an explanation of specific modules and 
essential genes and reactions in terms of specific structural or evolutionary constraints. 
Methodology: 

In the metabolic networks of Escherichia coli, Saccharomyces cerevisiae and Staphylococcus aureus, we identified 
metabolites with a low degree of connectivity, particularly those that are produced and/or consumed in just a 
single reaction. Using flux balance analysis (FBA) we also determined reactions essential for growth in these 
metabolic networks. We find that most reactions identified as essential in these networks turn out to be those 
involving the production or consumption of low degree metabolites. Applying graph theoretic methods to these 
metabolic networks, we identified connected clusters of these low degree metabolites. The genes involved in 
several operons in E. coli are correctly predicted as those of enzymes catalyzing the reactions of these clusters. 
Furthermore, we find that larger sized clusters are over-represented in the real network and are analogous to a 
'network motif'. Using FBA for the above mentioned three organisms we independently identified clusters of 
reactions whose fluxes are perfectly correlated. We find that the composition of the latter 'functional clusters' 
is also largely explained in terms of clusters of low degree metabolites in each of these organisms. 
Conclusion: 

Our findings mean that most metabolic reactions that are essential can be tagged by one or more low degree 
metabolites. Those reactions are essential because they are the only ways of producing or consuming their 
respective tagged metabolites. Furthermore, reactions whose fluxes are strongly correlated can be thought of 
as 'glued together' by these low degree metabolites. The methods developed here could be used in predicting 
essential reactions and metabolic modules in other organisms from the list of metabolic reactions. 
1 Introduction 



Evolution has produced organisms that are robust to various perturbations, yet the specific knockout of a single 
gene can be lethal to the organism. Similarly, organisms have some redundancy in their metabolic pathways, 
but single reactions whose knockout brings the growth of a cell to a halt — called 'essential' reactions — are also 
known to exist in metabolic networks [?]. What properties of a specific gene or reaction, within the context 
of the overall structure and organization of biochemical networks, make it essential for the organism? We show 
that most essential metabolic reactions in Escherichia coli [4], Saccharomyces cerevisiae [5] and Staphylococcus 
aureus [5] can be explained by the fact that they are associated with a low degree metabolite. Metabolic and 
protein interaction networks contain nodes with a large variation in their degree of connectivity [7, 8]. In 
the case of protein interaction networks it has been suggested that essentiality of a protein is correlated with its 
degree [8]. Hence, protein interaction networks are vulnerable to removal of highly connected proteins called 
'hubs'. In contrast, for metabolic networks, one is usually interested in the essentiality of reactions rather than 
metabolites. Recently, Mahadevan and Palsson [5] have shown that low degree metabolites are almost as likely to 
be associated with essential reactions as high degree metabolites. We show here that in fact almost all essential 
reactions are explained by virtue of being tagged to some low degree metabolite. 

Another theme in systems and computational biology has been to identify genetic regulatory modules [?], 
functional clusters [?] and graph-theoretic modules [?] in metabolic networks. Modularity of 
complex biological networks contributes to the robustness, flexibility, and evolvability of organisms, and also 
towards making their organization more comprehensible [21]. What structural features of metabolic networks 
cause specific subsets of metabolic reactions to have strongly correlated fluxes? We observe that low degree 
metabolites lead to one such structure in the metabolic network. Such metabolites contribute to a rigidity or 
coherence of reaction fluxes in the network resulting in clusters of highly correlated reactions. For example, in 
any steady state, where the concentrations of all metabolites are constant, a metabolite that can be produced 
in only one reaction and consumed in only one causes both reactions to have equal (or proportional with a fixed 
proportionality constant) fluxes. Maintaining the metabolic network close to a steady state then requires enzymes 
for both reactions to be simultaneously active, and hence the corresponding genes to be co-expressed, resulting 
in a transcription module containing those genes. In this work we first locate metabolites based purely on their 
low degree in the metabolic network. Then we show that clusters of their reactions predict genetic regulatory 
modules, as captured in the structure of operons [22, 23], with a high probability in E. coli. Furthermore, the 
composition of most functional clusters is also explained via the low degree clusters embedded inside them. 

Biological networks have two properties that are currently regarded as unrelated: One, they have functional 
modules, and two, they have single genes or metabolic reactions whose knockout is lethal. An implication of 
the present work is that in metabolic networks, both properties can arise as consequences of the same struc- 
tural property: the existence of low degree metabolites. Our work provides an explanation, rather than just 
identification, of essential reactions and metabolic modules. 

2 Lowest degree metabolites and their clusters. 

A metabolite may be designated as 'uniquely produced' or 'UP' ('uniquely consumed' or 'UC') if, in the 
bipartite graph of reactions and metabolites, the node corresponding to the metabolite has in-degree (out- 
degree) equal to unity; in other words, if there is only one reaction in the network that produces (consumes) 
the metabolite. A metabolite that is both UP and UC (a 'UP-UC metabolite') has the lowest degree in the 
network. Examples of UP-UC metabolites taken from the metabolic networks [?] of E. coli and S. aureus can 
be seen in Fig. 1. In any metabolic steady state the concentration of a metabolite is fixed; its rate of production 
is equal to that of consumption. Hence for a UP-UC metabolite in any steady state, the flux of the reaction 
producing it is proportional to that of the reaction consuming it, with the proportionality constant determined 
by the stoichiometric coefficients of the metabolite in the two reactions. A 'UP-UC cluster' of reactions may be 
defined as a set of reactions connected by UP-UC metabolites. In a steady state fixing the flux of any reaction in 
the UP-UC cluster fixes the fluxes of all other reactions in the cluster (see Fig. 1). These clusters include linear 
pathways but can also have branched or cyclic structure. UP-UC clusters are special cases of reaction/enzyme 
subsets [?] and fully coupled reactions or co-sets [?]. Each UP-UC cluster of reactions can be 
replaced by an effective reaction without affecting the steady state performance and can be used to coarse-grain 
metabolic networks [?]. 
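
The proportionality argument can be written out explicitly. As a sketch, let $m$ be a UP-UC metabolite produced only in reaction $p$ with stoichiometric coefficient $s_p > 0$ and consumed only in reaction $c$ with stoichiometric coefficient $s_c > 0$. The symbols $m$, $p$, $c$, $s_p$ and $s_c$ are introduced here for illustration and are not notation used elsewhere in the text.

```latex
\frac{d[m]}{dt} \;=\; s_p\, v_p \;-\; s_c\, v_c \;=\; 0
\quad\Longrightarrow\quad
v_c \;=\; \frac{s_p}{s_c}\, v_p .
```

Applying the same relation to each UP-UC metabolite in a cluster expresses every flux in the cluster as a fixed multiple of any one of them, which is why fixing one flux fixes all the others.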



A reaction was designated as 'uniquely producing' or 'UP' ('uniquely consuming' or 'UC') if it produced 
(consumed) a UP (UC) metabolite. The number of UP (UC) reactions in the metabolic networks of E. coli, S. 
cerevisiae and S. aureus were found to be 289 (272), 391 (370) and 277 (218), respectively, while the number of 
reactions that are either UP or UC or both (we refer to this set as 'UP/UC reactions') is 417, 583 and 376. We 
will show below that such reactions play a special role in metabolic networks. 
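
The definitions above translate directly into a counting procedure. The following is a minimal sketch on a made-up five-reaction network; the reaction and metabolite names, the `classify_up_uc` helper and the choice to pass external metabolites as a set are all assumptions made for illustration, not the code used in this study.

```python
from collections import defaultdict

# Hypothetical toy network: each one-way reaction maps to (substrates, products).
TOY_REACTIONS = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"C"}, {"D"}),
    "R4": ({"A"}, {"D"}),
    "R5": ({"D"}, {"E"}),
}
# Hypothetical external metabolites, excluded from the UP/UC classification.
TOY_EXTERNAL = {"A", "E"}


def classify_up_uc(reactions, external=frozenset()):
    """Return (UP metabolites, UC metabolites, UP reactions, UC reactions)."""
    producers = defaultdict(set)  # metabolite -> reactions that produce it
    consumers = defaultdict(set)  # metabolite -> reactions that consume it
    for rxn, (substrates, products) in reactions.items():
        for m in substrates - external:
            consumers[m].add(rxn)
        for m in products - external:
            producers[m].add(rxn)
    # A metabolite is UP (UC) if exactly one reaction produces (consumes) it.
    up_mets = {m for m, rxns in producers.items() if len(rxns) == 1}
    uc_mets = {m for m, rxns in consumers.items() if len(rxns) == 1}
    # A reaction is UP (UC) if it produces (consumes) a UP (UC) metabolite.
    up_rxns = {r for m in up_mets for r in producers[m]}
    uc_rxns = {r for m in uc_mets for r in consumers[m]}
    return up_mets, uc_mets, up_rxns, uc_rxns


up_mets, uc_mets, up_rxns, uc_rxns = classify_up_uc(TOY_REACTIONS, TOY_EXTERNAL)
print("UP-UC metabolites:", sorted(up_mets & uc_mets))
print("UP/UC reactions:", sorted(up_rxns | uc_rxns))
```

In this toy example metabolite D has two producers (R3 and R4), so R4 is the only reaction that is neither UP nor UC.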

3 Results 

3.1 Essential reactions are largely explained by UP/UC structure 

We used the flux balance analysis (FBA) [24, 25, 26] approach to determine essential reactions in the metabolic 
networks of E. coli, S. cerevisiae and S. aureus. We computed the steady state optimal flux vectors for each of 
these organisms in aerobic conditions for all permissible single organic carbon sources in a minimal medium. We 
found a feasible solution (with a nonzero growth rate) for 89, 43 and 27 sources in E. coli, S. cerevisiae and S. 
aureus respectively. The list of feasible carbon sources under minimal media in these organisms is provided in 
Supplementary Tables S1, S2 and S3. 

We considered the effect of 'switching off' reactions (by setting their maximum flux equal to zero) one by 
one, on the optimal growth rate for each food source. A reaction was designated as 'essential' for a particular 
food source if switching it off resulted in a zero optimal growth rate under that input condition. We designated a 
reaction as 'globally essential' for an organism if it was essential for all its feasible minimal media under aerobic 
conditions. The number of essential reactions for each minimal medium varied between 200 and 240, and 
the number of globally essential reactions was 164, for the E. coli metabolic network. Similarly, we found that 
the numbers of globally essential reactions in the metabolic networks of S. cerevisiae and S. aureus were 127 and 196, 
respectively. 
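
For reference, the essentiality test described above can be stated as a linear program. What follows is the standard FBA formulation written as a sketch, assuming every reversible reaction has been split into two one-way reactions so that all fluxes are non-negative. The symbols $S$ (stoichiometric matrix over internal metabolites), $v$ (flux vector), $v^{\max}$ (upper bounds fixed by the medium) and $v_{\mathrm{bio}}$ (flux of the biomass reaction) are notation assumed here rather than taken from the text.

```latex
\begin{aligned}
\mu^{*} \;=\; \max_{v}\;\; & v_{\mathrm{bio}} \\
\text{subject to}\;\; & S\,v = 0, \\
& 0 \le v_\alpha \le v^{\max}_\alpha \quad \text{for every one-way reaction } \alpha .
\end{aligned}
```

Switching off a reaction $\beta$ corresponds to setting $v^{\max}_\beta = 0$ and solving the problem again. Reaction $\beta$ is essential for a given medium if the new optimum is zero, and globally essential if this holds for every feasible minimal medium.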

3.1.1 Most essential reactions either produce or consume a UP or UC metabolite 

Of the 164 globally essential reactions in the E. coli metabolic network, 133 were found to be either UP or UC. 
Similarly, we found a high fraction of globally essential reactions in metabolic networks of S. cerevisiae and 
S. aureus to be UP or UC (see Table 1). This explains why this subset is essential: there is simply no other 
path around these reactions in the entire network to produce or consume some metabolite that is presumably 
required for the eventual production of biomass. In a recent paper [9], Mahadevan and Palsson have determined, 
for each metabolite in the network, the fraction of its reactions that are essential. They have observed that this 
'lethality fraction' of the low degree metabolites is on average comparable to high degree metabolites, and in 
particular, some metabolites with in and out degree unity (that we have designated here as UP-UC metabolites) 
have lethality fraction unity. We present here a stronger result regarding the role of low degree metabolites: most 
essential reactions involve at least one UP or UC metabolite. These reactions may involve other metabolites of 
higher degree, but their essentiality is due to their uniqueness in producing or consuming a low degree metabolite. 

3.1.2 The correspondence between essential and UP/UC reactions is even tighter in the 'reduced 
network' 

To understand the remaining globally essential reactions, we considered a reduced or pruned version of the 
network. Certain reactions in various reconstructed metabolic networks are such that they have a zero flux value 
under all steady states for stoichiometric reasons. These reactions are referred to as 'strictly detailed balanced' 
reactions or 'blocked' reactions [?], and can be removed from the network for any steady state analysis. 
We used a previously described algorithm [?] to determine blocked reactions in the metabolic networks of E. 
coli, S. cerevisiae and S. aureus. We found 290 (800, 294) of the 1176 (1579, 865) reactions in the E. coli (S. 
cerevisiae, S. aureus) metabolic network to be blocked. We removed the blocked reactions from each network to 
obtain the 'reduced network' for each organism (containing 886, 779 and 571 reactions respectively). 
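
The algorithm used to find blocked reactions is not reproduced here. As a rough illustration of why some reactions can never carry flux, the sketch below applies a simpler, purely topological pruning, which is a different technique from the one used in this study. With all fluxes non-negative, any reaction touching an internal metabolite that has no producer or no consumer must have zero flux in every steady state, and removing it may expose further dead ends. In general this finds only a subset of the blocked reactions. The toy network and the `prune_dead_ends` helper are hypothetical.

```python
# Hypothetical toy network: each one-way reaction maps to (substrates, products).
TOY_REACTIONS = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"C"}, {"E"}),
    "R4": ({"B"}, {"X"}),  # leads into a dead-end branch
    "R5": ({"Y"}, {"C"}),  # Y is never produced
    "R6": ({"X"}, {"Z"}),  # Z is never consumed
}
# Hypothetical external metabolites: these are not required to balance.
TOY_EXTERNAL = {"A", "E"}


def prune_dead_ends(reactions, external=frozenset()):
    """Iteratively drop reactions that touch an internal dead-end metabolite."""
    active = dict(reactions)
    while True:
        produced, consumed = set(), set()
        for substrates, products in active.values():
            consumed |= substrates
            produced |= products
        internal = (produced | consumed) - external
        # An internal metabolite lacking a producer or a consumer cannot balance.
        dead = {m for m in internal if m not in produced or m not in consumed}
        doomed = [r for r, (s, p) in active.items() if (s | p) & dead]
        if not doomed:
            return active
        for r in doomed:
            del active[r]


reduced = prune_dead_ends(TOY_REACTIONS, TOY_EXTERNAL)
print(sorted(reduced))
```

Here R5 and R6 are removed in the first pass, and R4 only in the second pass, once its product X has lost its sole consumer.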

Note that the essential reactions obtained by implementing FBA on the reduced network are exactly the same 
as those obtained from the original network for each input condition. Hence, instead of requiring a metabolite 
to be UP or UC across the entire metabolic network, we asked if it was UP or UC in the reduced network. 
The set of UP(UC) metabolites and reactions so obtained turns out to be somewhat smaller than the original 
set. In E. coli, S. cerevisiae and S. aureus the new set of UP/UC reactions has 352, 306 and 276 reactions, respectively. 
This is so because several reactions that were UP/UC in the original network happen to be blocked and are 
now removed. Conversely some metabolite that was earlier not UP(UC) can now become UP(UC) after the 
removal of a reaction. This adds new reactions to the UP/UC set but this number turns out to be smaller 
than the number removed (details are given in Supplementary Table S4). The new UP(UC) metabolites have, 
by definition, their in (out) degree unity in the reduced network; even in the original network they have a low 
degree (for E. coli their average in (out) degree in the original network is 1.31 (1.33)). We emphasize that the 
reduced network as defined above and hence the set of new UP(UC) reactions is uniquely determined by the 
original network. 

We found 156 out of the 164 globally essential reactions (95 %) in the E. coli metabolic network to be 
UP or UC in the reduced network. Similarly, we found that almost all globally essential reactions in S. cerevisiae 
and S. aureus were either UP or UC in the reduced network (92 and 93 % respectively; see Table 1) thereby 
underscoring the fact that nodes with a low degree of connectivity play an 'essential' role in metabolism. The 
importance of low-degree nodes in the essential functionality of complex autocatalytic networks has also been 
observed elsewhere [?] in a different context. 

This finding provides some insight into the structural or topological origin of essential reactions in metabolic 
networks. It is, of course, obvious that if a certain metabolite is an essential intermediate for the production 
of some biomass metabolite, and if this metabolite is uniquely produced or uniquely consumed, then the corre- 
sponding production or consumption reaction will be essential for the growth of the cell. However the converse 
of this statement — that all essential reactions in the network should have this topological property — is far 
from obvious. Our finding that about 5-8 % of essential reactions do not have this property proves that the 
converse statement is indeed false. Thus the fact that the overwhelming majority (92-95 %) of essential reactions 
have this topological property is a characterization of the nature of metabolic networks found in organisms. We 
remark that we do not as yet understand why the remaining essential reactions happen to be essential. 

3.1.3 Most UP/UC reactions are essential in some condition or other 

We found that there are 352 UP or UC metabolic reactions in the E. coli reduced network. 156 of these 352 
reactions were globally essential, while 288 of these 352 reactions (82 %) were essential for at least one of the 89 
possible minimal media in E. coli. Some of these UP/UC reactions were part of the input pathways of only one 
carbon source, hence they were essential only for that input. In S. cerevisiae 170 out of 306 UP/UC reactions 
(56 %) in the reduced network are essential in at least one input condition, while in S. aureus 257 out of 276 (93 
%) have this property. The substantial difference between S. cerevisiae, a eukaryote, and the two bacteria may 
reflect a more evolved metabolic structure that needs to be further investigated. 

3.1.4 Comparison between computationally determined essential reactions and lethal single gene 
knockouts 

To check the agreement of essential reactions in the E. coli metabolic network with a database [29] of experimentally 
determined essential genes in a rich medium, we implemented FBA for a rich medium containing all food 
sources for the E. coli metabolic network [30]. We found 95 reactions to be essential in this medium for E. coli. 
89 of these 95 reactions were found to be either UP or UC in the reduced network. Of the 95 essential reactions 
in rich medium, information about the corresponding genes was available for only 85 reactions. Of these 85, 
14 reactions had known isozymes, i.e., multiple enzymes catalyzing the reaction, hence the corresponding genes 
are not expected to be essential. Of the remaining 71 reactions, 5 had associated genes whose essentiality was 
undetermined in the database. Of the remaining 66 reactions, 38 (58 %) had associated genes that had been 
found to be essential in the database, which is a fairly high fraction. Conversely, of the 618 essential genes 
determined for E. coli by Gerdes et al., 158 genes were also part of the E. coli metabolic network [4] used for our 
study. 103 of the above 158 essential genes had their products catalyzing only a single reaction in the E. coli 
metabolic network. Of these 103 essential genes, 62 were associated with a UP or UC reaction. Further, using 
the reduced network, we found that 73 of the 103 essential genes were associated with a UP or UC reaction. 
The discrepancy between theoretical prediction and experimental data may be reconciled by the incomplete 
knowledge about possible isozymes for certain reactions or uncharacterized alternative metabolic pathways in 
the present in-silico metabolic model [3]. 



3.2 Low degree clusters predict regulatory modules 

We found that the E. coli metabolic network [4] contained 185 UP-UC metabolites. We determined all UP-UC 
clusters in the network (see methods). The total number of UP-UC clusters in E. coli metabolic network was 
found to be 85; their size distribution is shown by the grey bars in Fig. 2. The list of all reactions in each UP-UC 
cluster for the E. coli metabolic network is given in Supplementary Table S6. We then investigated whether 
the genes coding for the enzymes of the reactions in a UP-UC cluster are part of the same operon in E. coli. 
Genes on the same operon are by definition part of a genetic module since they are coregulated. At the moment 
genes corresponding to enzymes of reactions of the network have been identified for only part of the network. 
Of the 85 UP-UC clusters in the E. coli metabolic network, only 69 clusters had two or more reactions with 
known corresponding genes. We looked at the regulation of these 69 UP-UC clusters using the known operon 
information from RegulonDB [22] and EcoCyc [23] databases. Genes (of reactions within UP-UC clusters) that 
belong to the same operon in E. coli are indicated in Supplementary Table S5. For 42 of the 69 UP-UC clusters, 
we found that two or more genes of the cluster were part of the same operon. Further, 36 of these 42 UP-UC 
clusters had at least half of their genes belonging to the same operon. We also found that 21 UP-UC clusters 
have at least one possible set of constituent genes catalyzing all reactions in the cluster belonging to the same 
operon. 

To show that two genes belonging to a UP-UC cluster in E. coli have greater probability of lying on the 
same operon than otherwise expected, we performed the following test. We found 251 unique genes catalyzing 
various reactions in the 69 UP-UC clusters. If we randomly pick any two of these 251 genes, the probability that 
the two genes lie on the same operon is 0.0057. If we randomly pick a pair of genes that belong to the same 
UP-UC cluster from this set of 251 genes, the probability that the two genes lie on the same operon is 0.29. Thus 
regulatory modules are predicted correctly with a high probability by this method. It is possible that UP-UC 
clusters will find even greater correspondence with regulatory modules when expression data is analysed; our 
comparison rests only on operon data, and only about 25 percent of the transcriptional regulatory network of E. 
coli is presently believed to have been identified [?]. It would also be interesting to extend this analysis to the 
other two organisms. 
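
The comparison of the two probabilities amounts to counting gene pairs. The sketch below illustrates the calculation on invented data, assuming for simplicity that each gene belongs to exactly one UP-UC cluster and one operon. The gene, cluster and operon labels and the `same_operon_probability` helper are hypothetical and are not the annotations used in this study.

```python
from itertools import combinations

# Hypothetical annotations: gene -> UP-UC cluster, gene -> operon.
CLUSTER_OF = {"g1": "c1", "g2": "c1", "g3": "c1",
              "g4": "c2", "g5": "c2", "g6": "c2"}
OPERON_OF = {"g1": "o1", "g2": "o1", "g3": "o2",
             "g4": "o3", "g5": "o3", "g6": "o3"}


def same_operon_probability(pairs, operon_of):
    """Fraction of the given gene pairs whose two genes share an operon."""
    pairs = list(pairs)
    hits = sum(1 for g, h in pairs if operon_of[g] == operon_of[h])
    return hits / len(pairs)


# Baseline: any two genes drawn from the full gene set.
all_pairs = list(combinations(sorted(CLUSTER_OF), 2))
# Restricted: pairs whose two genes lie in the same UP-UC cluster.
cluster_pairs = [(g, h) for g, h in all_pairs if CLUSTER_OF[g] == CLUSTER_OF[h]]

p_random = same_operon_probability(all_pairs, OPERON_OF)
p_cluster = same_operon_probability(cluster_pairs, OPERON_OF)
print(p_random, p_cluster)
```

The ratio of the two values measures how strongly cluster membership enriches for operon co-membership; for the values reported above it is 0.29 / 0.0057, roughly a fifty-fold enrichment.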

3.3 Large UP-UC clusters are analogous to network motifs 

We asked the question: Is it expected that a network like the E. coli metabolic network of 618 metabolites and 
1176 reactions with 185 UP-UC metabolites will have a distribution of UP-UC clusters as given in Figure 2? To 
answer this question, we compared the distribution of UP-UC clusters in the real E. coli metabolic network with 
a suitably randomized version of the network. The randomized network has the same number of metabolite 
nodes and reaction nodes and the same number of incoming and outgoing links at each node as the real E. coli 
metabolic network (see methods). Averaging over 1000 realizations of the randomized metabolic network we 
found a cluster distribution as shown by the black line in Fig. 2. 

This shows that the actual metabolic network of E. coli has its UP-UC metabolites bunched up next to each 
other, forming larger clusters than expected in random networks with the same local connectivity properties. 
Thus, larger size (size > 8) UP-UC clusters are over-represented in the real E. coli metabolic network, and 
may be collectively considered as analogous to a network motif (as defined in [?]), while smaller size (< 3) 
UP-UC clusters are under-represented in the real network, and may be collectively considered analogous to an 
'anti-motif' [33]. We also found qualitatively similar results for the metabolic networks of S. cerevisiae and S. 
aureus (data not shown). 

3.4 Low degree metabolites explain perfect clusters 

Correlated reaction sets are sets of reactions in the metabolic network that are always used together in functional 
states of the network. Each flux vector obtained using FBA represents one possible functional state of the 
network. For each feasible minimal medium we obtained one flux vector with a nonzero growth rate. We 
defined an 'active' reaction as one that had a nonzero flux in at least one of the latter flux vectors. Then we 
computed the correlation coefficient among fluxes of the active reactions across these flux vectors in a manner 
analogous to the correlation of gene activity from microarray data across different conditions [?] (see methods). 
A 'perfect cluster' is a set of reactions whose pairwise correlation coefficients with each other are all unity across 
all sets of conditions. Reactions in perfect clusters have fluxes that are proportional to each other with the same 
proportionality constant under all the flux vectors considered. We found that in the E. coli metabolic network, 
most of the 582 active reactions under 89 input conditions were contained in several perfect clusters of size 2 or 
more (see Table 2). These clusters, reported earlier in [32], overlap highly with the clusters of [?]. One might ask: 
Why are particular subsets of reactions perfectly correlated with each other? UP-UC clusters provide a structural 
explanation for these perfect clusters. Of the 85 UP-UC clusters in the entire E. coli network, 46 UP-UC clusters 
are in the set of active reactions. All the 46 active UP-UC clusters are subsets of perfect clusters. To further 
explain the observed clustering of reactions in the E. coli metabolic network, we considered UP(UC) metabolites 
in the reduced network. We found 94 UP-UC clusters in the reduced network for E. coli. Table 2 shows that 
most of the perfect clusters in E. coli are explained in terms of UP-UC clusters in the reduced network in the 
sense that UP-UC clusters account for the bulk of reactions in the perfect clusters. Most of the co-sets reported 
in [?] for E. coli are also explained by UP-UC clusters in the reduced network (see Supplementary Table S7). 
Further, we found that most perfect clusters in the metabolic networks of S. cerevisiae and S. aureus are also 
explained by UP-UC clusters in their respective reduced networks (see Supplementary Tables S8 and S9). 

4 Discussion 

In this paper we have observed that the lowest degree metabolites are implicated in two distinct properties of the 
metabolic networks, one, the existence of essential metabolic reactions (and lethal single metabolic gene knock- 
outs), and two, existence of functional clusters in the metabolic networks (and associated regulatory modules). 

To some extent the identification of UP/UC metabolites depends on the way the metabolic network is 
curated. For example, the networks we have used leave out certain non-enzymatic reactions such as protonation- 
deprotonation reactions. Since their inclusion would render some of the presently UP(UC) metabolites non- 
UP(UC), our definition of UP(UC) could be criticized as being somewhat arbitrary. In this context it is worth 
noting that for the networks as they stand, our definition of UP(UC) allows us to establish a connection between 
distinct properties of the network (e.g., between essentiality, a functional property and the UP/UC character, a 
topological property), and that our main findings hold for metabolic networks of three distinct organisms. This 
suggests that UP/UC reactions as defined by us do capture a certain pattern. In our view the important point 
is not that other definitions of the network would obscure the pattern, but rather, that there do exist systematic 
definitions of the network in which a pattern is visible. 

In metabolic networks the very existence of essential reactions is an indicator of the fragility of the system: 
Even though the network has many reaction nodes, the removal of a single essential reaction node destroys the 
functionality of the network completely by blocking the flow of an essential intermediate. Isozymes are a way of 
dealing with this fragility. However, not all essential reactions have isozymes [35]: this means that evolution has 
tolerated this fragility. Our finding that essential reactions are tagged by low degree metabolites may provide 
some insight into why this is the case. Metabolites that participate in very few reactions perhaps do so in part 
because some feature of their chemical structure prohibits ready association with other molecules, i.e., their 
low degree is a consequence of constraints coming from chemistry. Then evolution tolerates the reactions that 
produce or consume such metabolites as essential because chemistry leaves it no choice. 

Alternatively, it could be that this fragility happens to be a byproduct of some other desirable structural 
property that contributes to robustness or evolvability, such as modularity. We have drawn attention to the 
fact that low degree metabolites also play a role in functional clustering of reactions in the metabolic network. 
We have further provided evidence that the UP-UC clusters at the metabolic level correspond, with a high 
probability, to sets of genes forming modules at the regulatory level in E. coli. 

This raises the question: if low degree metabolites contribute to modularity, could it be that the evolutionary 
advantages of that have outweighed the disadvantage of the above mentioned fragility caused by the same low 
degree metabolites? Is it the case that evolution has preferred 'chemically constrained' low degree metabolites 
in spite of the fragility they cause because they contribute to modularity? A goal in biology is to understand 
highly evolved biological organization in terms of simpler and more inevitable structures [36]. Here we have 
presented evidence that certain genetic regulatory modules, in particular certain operons, mirror the low degree 
structure of the metabolites whose production and consumption they regulate. This could be an example of how 
the origin of certain regulatory structure can be traced to simple chemical constraints. 



5 Methods 



5.1 Detection of UP-UC clusters 

We used recently reconstructed metabolic networks of E. coli (version iJR904 [4]), S. cerevisiae (version iND750 
[5]) and S. aureus (version iSB619 [5]) in this study. The networks were downloaded from the site 
http://gcrg.ucsd.edu/organisms/index.html. Each reversible reaction in the network was converted into two one sided 
reactions. We excluded the external metabolites in the three metabolic networks while determining the UP-UC 
metabolites. For calculating various UP-UC clusters, we first identify all UP-UC metabolites in the bipartite 
graph of the network. We then delete all links in the graph except those going into and out of UP-UC metabolites. 
From this new bipartite graph, we generate a reaction-reaction graph, in which two reactions are connected if 
one consumes a metabolite produced by the other. The weak components of size ≥ 2 of the reaction-reaction 
graph are the various UP-UC clusters in the network. 
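
The cluster detection steps can be sketched as follows. This is an illustrative implementation using a union-find structure to extract weakly connected components; the toy network and the `up_uc_clusters` helper are assumptions. Components with at least two reactions are kept, which matches the size-2 UP-UC clusters listed in Table 2.

```python
from collections import defaultdict

# Hypothetical toy network: each one-way reaction maps to (substrates, products).
TOY_REACTIONS = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"C"}, {"D"}),
    "R4": ({"A"}, {"D"}),
    "R5": ({"D"}, {"E"}),
    "R6": ({"F"}, {"G"}),
    "R7": ({"G"}, {"H"}),
}
# Hypothetical external metabolites, excluded when finding UP-UC metabolites.
TOY_EXTERNAL = {"A", "E", "F", "H"}


def up_uc_clusters(reactions, external=frozenset()):
    """Return UP-UC clusters as sorted lists of reaction names."""
    producers, consumers = defaultdict(set), defaultdict(set)
    for rxn, (substrates, products) in reactions.items():
        for m in substrates - external:
            consumers[m].add(rxn)
        for m in products - external:
            producers[m].add(rxn)

    parent = {r: r for r in reactions}  # union-find forest over reactions

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    # Keep only links through UP-UC metabolites: join the unique producer
    # and the unique consumer of each such metabolite.
    for m in set(producers) & set(consumers):
        if len(producers[m]) == 1 and len(consumers[m]) == 1:
            (p,), (c,) = producers[m], consumers[m]
            parent[find(p)] = find(c)

    groups = defaultdict(set)
    for r in reactions:
        groups[find(r)].add(r)
    # Weak components with at least two reactions are the clusters.
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


clusters = up_uc_clusters(TOY_REACTIONS, TOY_EXTERNAL)
print(clusters)
```

Because edge direction is ignored when merging, the result is the set of weak components, so branched and cyclic clusters are handled the same way as linear chains.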

5.2 Generation of randomized networks 

We constructed the matrix $A = (A_{i\alpha})$ where $A_{i\alpha}$ equals 1 if metabolite $i$ is produced in reaction $\alpha$, -1 if it 
is consumed in reaction $\alpha$ and 0 if it does not participate in reaction $\alpha$. Each nonzero entry of A defines a 
link in the bipartite graph of metabolites and reactions. Starting from A for the real network, we generated 
randomized networks keeping the degree of each metabolite and reaction node unchanged [?]. It is important 
to distinguish between two kinds of links; one coming into a metabolite node from a reaction node and the other 
going out of a metabolite node to a reaction node. All the links or edges in this bipartite graph were divided 
into these two groups. Two links are then randomly selected in one of these two groups and swapped. Before 
swapping, we ensure that the metabolite involved in any link is not already involved in the reaction corresponding 
to the other link. This process of selecting a random pair of links was repeated 18000 times. We verified that 
more than 99.9% of the links were visited at least once. Starting from the real metabolic network, this procedure 
is repeated 1000 times (with different random number seeds), the UP-UC clusters determined for each of the 
1000 realizations of the randomized network and the average taken thereof. 
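
A minimal sketch of the link-swapping step is given below. It assumes that links are stored as (metabolite, reaction) pairs in two separate lists, that no metabolite appears on both sides of the same reaction, and that the group to swap within is chosen uniformly at random; the last point is an interpretation, since the text does not specify how the group is chosen. The `randomize_links` and `degrees` helpers and the toy links are hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy links as (metabolite, reaction) pairs.
PROD = [("B", "R1"), ("C", "R2"), ("D", "R3"), ("D", "R4"), ("E", "R5")]
CONS = [("A", "R1"), ("B", "R2"), ("C", "R3"), ("A", "R4"), ("D", "R5")]


def degrees(links):
    """Degree of every metabolite and every reaction within one link group."""
    return Counter(m for m, _ in links), Counter(r for _, r in links)


def randomize_links(prod_links, cons_links, n_swaps, seed=0):
    """Swap link endpoints within a group, preserving all in/out degrees."""
    rng = random.Random(seed)
    groups = [list(prod_links), list(cons_links)]
    involved = set(groups[0]) | set(groups[1])  # every (metabolite, reaction) pair
    for _ in range(n_swaps):
        links = rng.choice(groups)  # assumption: group picked uniformly
        i, j = rng.sample(range(len(links)), 2)
        (m1, r1), (m2, r2) = links[i], links[j]
        # Reject the swap if either metabolite already takes part in the
        # other link's reaction (this also rejects m1 == m2 or r1 == r2).
        if (m1, r2) in involved or (m2, r1) in involved:
            continue
        involved -= {(m1, r1), (m2, r2)}
        involved |= {(m1, r2), (m2, r1)}
        links[i], links[j] = (m1, r2), (m2, r1)
    return groups[0], groups[1]


new_prod, new_cons = randomize_links(PROD, CONS, n_swaps=200, seed=1)
print(degrees(new_prod) == degrees(PROD), degrees(new_cons) == degrees(CONS))
```

In the procedure described above the number of swap attempts was 18000 per realization; the value 200 here is only for the toy example.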

5.3 Perfect clusters 

Using FBA we obtained $v^I_\alpha$, the velocity of reaction $\alpha$ in an optimal steady state corresponding to input condition 
$I$, $I = 1, \ldots, M$, $\alpha = 1, \ldots, N$, where M is the number of feasible minimal media and N is the number of distinct 
one way reactions in the metabolic network. These define the M flux vectors we consider. Given a set of M 
flux vectors, the correlation coefficient [10] between two active reactions $\alpha$ and $\beta$ is given by 

\[ C_{\alpha\beta} = \frac{1}{M} \sum_{I=1}^{M} \frac{v^I_\alpha \, v^I_\beta}{\sigma_\alpha \, \sigma_\beta} , \]

where $\sigma_\alpha = \left[ (1/M) \sum_{I=1}^{M} (v^I_\alpha)^2 \right]^{1/2}$. Reactions $\alpha$ and $\beta$ are said to be perfectly correlated in the given set of flux 
vectors if $C_{\alpha\beta} = 1$ for that set and all its subsets of flux vectors. A numerical value of $C_{\alpha\beta} > 0.999999$ was 
taken as 'unity' for this purpose. Perfect clusters were identified by locating maximal sets of reactions that were 
perfectly correlated to each other pairwise. 
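
The sketch below shows one way to compute such clusters. It assumes the correlation is the uncentered one suggested by the definition of σ above, i.e. the mean product of two fluxes divided by the product of their root-mean-square values, which equals 1 exactly when two flux vectors are proportional with a positive constant. The flux values and the `correlation` and `perfect_clusters` helpers are hypothetical. Only the full set of flux vectors is checked, since proportionality on the full set carries over to its subsets.

```python
import math

# Hypothetical fluxes: reaction -> flux in each of M = 3 input conditions.
TOY_FLUXES = {
    "R1": [1.0, 2.0, 3.0],
    "R2": [2.0, 4.0, 6.0],   # proportional to R1
    "R3": [1.0, 0.0, 1.0],
    "R4": [3.0, 0.0, 3.0],   # proportional to R3
    "R5": [0.0, 0.0, 0.0],   # inactive: zero flux in every condition
    "R6": [1.0, 1.0, 1.0],
}


def correlation(va, vb):
    """Uncentered correlation of two flux profiles over M conditions."""
    m = len(va)
    sigma_a = math.sqrt(sum(x * x for x in va) / m)
    sigma_b = math.sqrt(sum(x * x for x in vb) / m)
    return sum(x * y for x, y in zip(va, vb)) / (m * sigma_a * sigma_b)


def perfect_clusters(fluxes, threshold=0.999999):
    """Group active reactions whose pairwise correlation counts as unity."""
    active = {r: v for r, v in fluxes.items() if any(x != 0 for x in v)}
    clusters = []
    for rxn, v in active.items():
        for cluster in clusters:
            # Proportionality is transitive, so comparing with one
            # representative of the cluster is sufficient.
            if correlation(v, active[cluster[0]]) > threshold:
                cluster.append(rxn)
                break
        else:
            clusters.append([rxn])
    return [c for c in clusters if len(c) >= 2]


print(perfect_clusters(TOY_FLUXES))
```

Reaction R6 is active in every condition but is not proportional to any other reaction, so it forms no cluster of size 2 or more.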
TABLES 



Organism                                      E. coli    S. cerevisiae    S. aureus

Total number of reactions                     1176       1579             865

Number of globally essential reactions        164        127              196

Number of globally essential reactions
that are UP or UC in the entire network       133        86               157

Number of globally essential reactions
that are UP or UC in the reduced network      156        117              182



Table 1. Almost all globally essential reactions in E. coli, S. cerevisiae and S. aureus are UP or UC. 



Size of             Number of           Number of perfect     Breakup of explained clusters into
perfect clusters    perfect clusters    clusters explained    UP-UC clusters in the reduced network

2                   48                  22                    22 x (2)
3                   19                  9                     7 x (3) + 2 x (2)
4                   11                  10                    8 x (4) + 1 x (3) + 1 x (2)
5                   4                   3                     1 x (4) + 1 x (3+2) + 1 x (2+2)
6                   1                   1                     1 x (6)
7                   1                   1                     1 x (7)
8                   2                   2                     1 x (6+2) + 1 x (5+2)
148                 1                   1                     (14+12+10+9+7+6+6+6+5+5+4+4+
                                                              4+4+4+3+3+3+2+2+2+2+2+2+2+2)



Table 2. The size distribution of perfect clusters in the E. colt metabolic network and their explanation 
in terms of UP-UC clusters. 

The third column lists the number of perfect clusters that are explained by UP-UC clusters calculated using the reduced 
network. The fourth column gives the breakup of the explained perfect clusters in terms of UP-UC clusters of various 
sizes. For example, in the second row the entry 7 x (3) + 2 x (2) means that 7 UP-UC clusters of size 3 are identical 
to 7 perfect clusters of size 3 and, furthermore, that two UP-UC clusters of size 2 are subsets of two perfect clusters 
of size 3. In the fourth row the term 1 x (3+2) means that one of the perfect clusters of size 5 contains two distinct 
UP-UC clusters of sizes 3 and 2. There are 26 UP-UC clusters that are part of the largest perfect cluster of 148 
reactions, accounting for 125 reactions in it. This largest perfect cluster is a subset of the reactions that are 
active for all input conditions and is located near the output end of the metabolic network. 
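
The counts quoted for the largest perfect cluster can be checked against the last row of Table 2. The following sketch simply sums the UP-UC cluster sizes listed there; the variable names are illustrative.

```python
# Sizes of the UP-UC clusters contained in the perfect cluster of 148
# reactions, copied from the last row of Table 2.
largest_cluster_breakup = [14, 12, 10, 9, 7, 6, 6, 6, 5, 5, 4, 4, 4,
                           4, 4, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2]

n_up_uc_clusters = len(largest_cluster_breakup)
n_reactions_explained = sum(largest_cluster_breakup)

print(n_up_uc_clusters, "UP-UC clusters covering",
      n_reactions_explained, "of 148 reactions")
```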



FIGURES 




Figure 1. (a) UP-UC metabolites in the E. coli metabolic network forming a UP-UC cluster of 10 reactions. (b) UP-UC 
metabolites in the S. aureus metabolic network forming a UP-UC cluster of 6 reactions. Rectangles represent reactions 
and ovals metabolites. Yellow ovals represent UP-UC metabolites. Arrows to (from) metabolites represent their 
production (consumption) in reactions. A blue (red) link represents the production (consumption) of a UP (UC) metabolite. 
Note that UP-UC clusters are not strictly linear pathways. For example, in part (a) the reactions in the cluster are not 
all in a single chain, and in part (b) there is a cycle inside the UP-UC cluster. Nevertheless, fixing the flux of any 
one reaction in a UP-UC cluster fixes the fluxes of all other reactions in the cluster in any steady state, since the 
production rate of every UP-UC metabolite must be the same as its consumption rate. Hence, in part (a), fixing the flux 
of reaction GCALD fixes the flux of reaction DHNPA2 (because of the intermediate UP-UC metabolite gcald), which in turn 
fixes the fluxes of reactions HPPK2 and DNMPPA, and so on. All reactions in parts (a) and (b) are globally essential in 
E. coli and S. aureus, respectively. To reduce clutter, nodes corresponding to h (proton) and h2o have been omitted. 
Abbreviations of metabolite and reaction names in part (a) are as in Reed et al. (2003) and in part (b) as in Becker 
and Palsson (2005). The figures were drawn using Graphviz software. 
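
The flux-fixing argument in this caption suggests a simple way to find UP-UC clusters from a stoichiometric matrix. The sketch below is a hedged illustration, not the authors' code: it assumes that all reactions are one-way, that a UP-UC metabolite is one produced by exactly one reaction and consumed by exactly one reaction, and that a UP-UC cluster is a set of reactions connected through such metabolites. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def up_uc_clusters(S):
    """S[m, r] > 0 if reaction r produces metabolite m, < 0 if it consumes it."""
    S = np.asarray(S)
    n_reactions = S.shape[1]
    parent = list(range(n_reactions))           # union-find forest over reactions

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]       # path halving
            r = parent[r]
        return r

    for row in S:
        producers = np.flatnonzero(row > 0)
        consumers = np.flatnonzero(row < 0)
        # A uniquely produced and uniquely consumed metabolite ties the flux
        # of its producer to the flux of its consumer at steady state.
        if len(producers) == 1 and len(consumers) == 1:
            parent[find(int(producers[0]))] = find(int(consumers[0]))

    groups = {}
    for r in range(n_reactions):
        groups.setdefault(find(r), []).append(r)
    return sorted(g for g in groups.values() if len(g) > 1)

# Toy network: R0: -> A, R1: A -> B, R2: B -> C, R3: C ->, R4: C ->
S_example = [[1, -1,  0,  0,  0],   # A: produced by R0, consumed by R1
             [0,  1, -1,  0,  0],   # B: produced by R1, consumed by R2
             [0,  0,  1, -1, -1]]   # C: consumed by two reactions, not UP-UC
print(up_uc_clusters(S_example))
```

Metabolite C has two consumers, so reactions R3 and R4 stay outside the cluster while R0, R1 and R2 are linked through A and B.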




Figure 2. Frequency histogram of UP-UC cluster sizes in the E. coli metabolic network (grey bars). Data is shown in 
Supplementary Table S5. The black line is the frequency distribution for the randomized versions of the network (averaged 
over 1000 realizations) that preserve the in- and out-degree of all nodes. Error bars show one standard deviation of the 
randomized ensemble. Inset: Enlargement of the graph for the larger sized clusters. In the real network, larger UP-UC 
clusters (size > 8) occur much more often than in the randomized version (p < 0.001). On the other hand, smaller UP-UC 
clusters (size < 3) occur much less often than in the randomized version (p < 0.001). 
Supplementary Table S1: List of carbon sources that gave flux vectors with a nonzero growth 
rate under minimal media for aerobic conditions in the E. coli metabolic network (version 
iJR904; Reed et al. Genome Biology 4: R54.1 (2003)) 



Abbreviation | Metabolite name                | Abbreviation | Metabolite name
12ppd-S      | (S)-Propane-1,2-diol           | glu-L        | L-Glutamate
2ddglcn      | 2-Dehydro-3-deoxy-D-gluconate  | gly          | Glycine
3hcinnm      | 3-hydroxycinnamic acid         | glyald       | D-Glyceraldehyde
3hpppn       | 3-(3-hydroxy-phenyl)propionate | glyc         | Glycerol
4abut        | 4-Aminobutanoate               | glyc3p       | Glycerol 3-phosphate
ac           | Acetate                        | glyclt       | Glycolate
acac         | Acetoacetate                   | gsn          | Guanosine
acald        | Acetaldehyde                   | hdca         | Hexadecanoate (n-C16:0)
acgam        | N-Acetyl-D-glucosamine         | idon-L       | L-Idonate
acmana       | N-Acetyl-D-mannosamine         | ins          | Inosine
acnam        | N-Acetylneuraminate            | lac-D        | D-Lactate
adn          | Adenosine                      | lac-L        | L-Lactate
akg          | 2-Oxoglutarate                 | lcts         | Lactose
ala-D        | D-Alanine                      | mal-L        | L-Malate
ala-L        | L-Alanine                      | malt         | Maltose
alltn        | Allantoin                      | malthx       | Maltohexaose
arab-L       | L-Arabinose                    | maltpt       | Maltopentaose
arg-L        | L-Arginine                     | malttr       | Maltotriose
asn-L        | L-Asparagine                   | maltttr      | Maltotetraose
asp-L        | L-Aspartate                    | man          | D-Mannose
but          | Butyrate (n-C4:0)              | man6p        | D-Mannose 6-phosphate
cit          | Citrate                        | melib        | Melibiose
cytd         | Cytidine                       | mnl          | D-Mannitol
dad-2        | Deoxyadenosine                 | ocdca        | octadecanoate (n-C18:0)
dcyt         | Deoxycytidine                  | orn          | Ornithine
dgsn         | Deoxyguanosine                 | pppn         | Phenylpropanoate
dha          | Dihydroxyacetone               | pro-L        | L-Proline
din          | Deoxyinosine                   | ptrc         | Putrescine
duri         | Deoxyuridine                   | pyr          | Pyruvate
etoh         | Ethanol                        | rib-D        | D-Ribose
fru          | D-Fructose                     | rmn          | L-Rhamnose
fuc-L        | L-Fucose                       | sbt-D        | D-Sorbitol
fum          | Fumarate                       | ser-D        | D-Serine
g6p          | D-Glucose 6-phosphate          | ser-L        | L-Serine
gal          | D-Galactose                    | succ         | Succinate
galct-D      | D-Galactarate                  | sucr         | Sucrose
galctn-D     | D-Galactonate                  | tartr-L      | L-tartrate
galt         | Galactitol                     | thr-L        | L-Threonine
galur        | D-Galacturonate                | tre          | Trehalose
gam          | D-Glucosamine                  | trp-L        | L-Tryptophan
glc-D        | D-Glucose                      | ttdca        | tetradecanoate (n-C14:0)
glcn         | D-Gluconate                    | uri          | Uridine
glcr         | D-Glucarate                    | xtsn         | Xanthosine
glcur        | D-Glucuronate                  | xyl-D        | D-Xylose
gln-L        | L-Glutamine                    |              |







Each carbon source was provided along with ammonium, Fe2+, oxygen, phosphate, potassium, proton, sodium, 
sulfate and water for uptake as part of each minimal growth medium. 



Supplementary Table S2: List of carbon sources that gave flux vectors with a 
nonzero growth rate under minimal media for aerobic conditions in the S. cerevisiae 
metabolic network (version iND750; Duarte et al. Genome Res. 14: 1298 (2004)) 



Abbreviation | Metabolite name           | Abbreviation | Metabolite name
13BDglcn     | 1,3-beta-D-Glucan         | glyc         | Glycerol
4abut        | 4-Aminobutanoate          | gsn          | Guanosine
ac           | Acetate                   | ins          | Inosine
acald        | Acetaldehyde              | mal-L        | L-Malate
adn          | Adenosine                 | malt         | Maltose
akg          | 2-Oxoglutarate            | man          | D-Mannose
ala-L        | L-Alanine                 | melib        | Melibiose
amet         | S-Adenosyl-L-methionine   | orn          | Ornithine
arg-L        | L-Arginine                | pap          | Adenosine 3',5'-bisphosphate
asn-L        | L-Asparagine              | pro-L        | L-Proline
asp-L        | L-Aspartate               | pyr          | Pyruvate
cit          | Citrate                   | rib-D        | D-Ribose
cytd         | Cytidine                  | sbt-D        | D-Sorbitol
etoh         | Ethanol                   | ser-L        | L-Serine
fru          | D-Fructose                | succ         | Succinate
fum          | Fumarate                  | sucr         | Sucrose
gal          | D-Galactose               | tre          | Trehalose
gam6p        | D-Glucosamine 6-phosphate | uri          | Uridine
glc-D        | D-Glucose                 | xtsn         | Xanthosine
gln-L        | L-Glutamine               | xyl-D        | D-Xylose
glu-L        | L-Glutamate               | xylt         | Xylitol
gly          | Glycine                   |              |







Each carbon source was provided along with ammonium, oxygen, potassium, phosphate, proton, sodium, 
sulfate and water for uptake as part of each minimal growth medium. 



Supplementary Table S3: List of carbon sources that gave flux vectors with a 
nonzero growth rate under minimal media for aerobic conditions in the S. aureus metabolic 
network (version iSB619; Becker and Palsson, BMC Microbiology 5: 8 (2005)) 



Abbreviation | Metabolite name         | Abbreviation | Metabolite name
acgam        | N-Acetyl-D-glucosamine  | lac-D        | D-Lactate
acnam        | N-Acetylneuraminate     | lac-L        | L-Lactate
ala-D        | D-Alanine               | lcts         | Lactose
ala-L        | L-Alanine               | malt         | Maltose
arg-L        | L-Arginine              | man          | D-Mannose
asp-L        | L-Aspartate             | mnl          | D-Mannitol
etoh         | Ethanol                 | orn          | Ornithine
fru          | D-Fructose              | pro-L        | L-Proline
glc-D        | D-Glucose               | rib-D        | D-Ribose
glcn         | D-Gluconate             | ser-L        | L-Serine
glu-L        | L-Glutamate             | sucr         | Sucrose
glyc         | Glycerol                | thr-L        | L-Threonine
glyc3p       | Glycerol 3-phosphate    | tre          | Trehalose
hdca         | Hexadecanoate (n-C16:0) |              |







Each carbon source was provided along with ammonium, Cu2+, cytosine, Fe2+, Mg, Mn2+, Ni2+, nicotinate, 
nitrate, nitrite, oxygen, phosphate, potassium, proton, sodium, sulfate, thiamin and water for uptake as part of 
each minimal growth medium. 



Supplementary Table S4: UP(UC) reaction statistics in the original and reduced 
metabolic networks 

Organism                                                                                   | E. coli | S. cerevisiae | S. aureus
Number of reactions in the original network                                                | 1176    | 1579          | 865
Number of UP reactions in the original network                                             | 289     | 391           | 277
Number of UC reactions in the original network                                             | 272     | o/ u          | c. 1 o
Number of UP/UC reactions in the original network                                          | 417     | 583           | 376
Number of blocked reactions                                                                |         | 800           |
Number of UP/UC reactions in the original network that are blocked                         | 136     | 386           | 174
Number of reactions in the reduced network                                                 | 886     | 779           | 571
Number of UP reactions in the reduced network                                              | 245     | 218           | 224
Number of UC reactions in the reduced network                                              | 245     | 218           | 181
Number of UP/UC reactions in the reduced network                                           | 352     | 306           | 276
Number of UP/UC reactions in the reduced network that are not UP/UC in the original network | 71      | 109           | 74



Supplementary Table S5: Size distribution of UP-UC clusters in the E. 
coli network and its randomized versions. 85 UP-UC clusters of 
size ranging from 2 to 10 reactions were found in the network; the 
number of clusters of each size is given in the second column of 
the table. The third column gives the UP-UC cluster size distribution 
for randomized networks with the same local connectivity as the real 
network, averaged over 1000 realizations of the randomized 
network. 

Size of UP-UC cluster | Number of clusters in real network | Number of clusters in randomized networks (mean ± S.D.)
2                     | 49                                 | 101.32 ± 8.60
3                     | 16                                 | 22.49 ± 4.28
4                     | 5                                  | 6.62 ± 2.38
5                     | 7                                  | 2.15 ± 1.44
6                     | 2                                  | 0.84 ± 0.89
7                     | 0                                  | 0.34 ± 0.57
8                     | 2                                  | 0.15 ± 0.39
9                     | 3                                  | 0.06 ± 0.24
10                    | 1                                  | 0.02 ± 0.14
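
One way to read the numbers in Supplementary Table S5 is to express each observed count as a deviation from the randomized mean in units of the randomized standard deviation. The sketch below computes such z-scores. This is an illustrative summary statistic and an assumption on our part; it is not necessarily how the p-values quoted for Figure 2 were obtained. The entry for size 7 is blank in the extracted table and is taken as 0 here, which is the value implied by the stated total of 85 clusters.

```python
# size: (observed count, randomized mean, randomized standard deviation)
table_s5 = {
    2: (49, 101.32, 8.60),
    3: (16, 22.49, 4.28),
    4: (5, 6.62, 2.38),
    5: (7, 2.15, 1.44),
    6: (2, 0.84, 0.89),
    7: (0, 0.34, 0.57),   # blank in the source; 0 inferred from the total of 85
    8: (2, 0.15, 0.39),
    9: (3, 0.06, 0.24),
    10: (1, 0.02, 0.14),
}

def z_score(observed, mean, sd):
    """Deviation of the observed count from the randomized ensemble, in S.D. units."""
    return (observed - mean) / sd

z_scores = {size: z_score(*row) for size, row in table_s5.items()}
total_clusters = sum(row[0] for row in table_s5.values())

for size, z in z_scores.items():
    print(f"size {size:2d}: z = {z:+.2f}")
```

Negative values for the smallest sizes and large positive values for sizes 8 to 10 reflect the under- and over-representation described in the caption of Figure 2.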



Supplementary Table S6: UP-UC clusters of metabolic reactions and the clustering of their genes in operons 



The table lists the 85 UP-UC clusters of various sizes in the E. coli metabolic network, computed as described in the main text. The network used is the one compiled by Reed et al. Genome Biology 4: R54.1 (2003). In 
the table below, the abbreviations of reactions and metabolites, the description of the metabolic pathway where the reaction occurs, the reaction equation and the Gene-Protein-Reaction (GPR) association are taken from the 
same database. After identifying the gene(s) for each reaction in every cluster using the GPR association, we determined which of the genes associated with a cluster are in the same operon. For a given cluster, genes 
in the same operon are coloured with the same shade (red, brown or pink). Information about operons was obtained from RegulonDB (Salgado et al. Nucleic Acids Res. 32, D303 (2003)) and EcoCyc (Karp et al. Nucleic 
Acids Res. 30, 56 (2002)). 

Furthermore, we have added information obtained using flux balance analysis as to whether each reaction is 'active' or 'inactive'. Active reactions are those that were found to have a non-zero flux for at least one of the 
89 flux vectors, each corresponding to a different minimal medium (see main text), and inactive reactions are those that had a zero flux for all the 89 flux vectors. Reactions for which the corresponding gene name was 
not available in the database have been labelled as NA in the GPR association column. Of these 85 UP-UC clusters, 69 clusters are such that genes are identified for at least two distinct reactions in the cluster. The 
remaining 16 clusters include (a) 8 clusters that do not have at least two identified genes in the network, and (b) 8 forward and reverse direction pairs of the same reversible reaction. Such clusters are shaded blue. Of 
the above 69 clusters, 42 show a significant level of coregulation in that two or more genes in the cluster are part of the same operon. Of these 42, 26 are in the active set and 16 are in the inactive set. 
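
The coregulation criterion described above (two or more genes of a cluster lying in the same operon) can be sketched as follows. The function name, the gene-to-operon mapping and the operon labels are hypothetical placeholders. In practice the mapping would come from operon annotations such as those cited above, and the gene lists from the GPR associations in the table.

```python
from collections import Counter

def shares_operon(cluster_genes, gene_to_operon):
    """Return True if at least two distinct genes of the cluster map to one operon.

    cluster_genes: gene identifiers collected from the GPR associations of
    all reactions in one UP-UC cluster. Genes without a known operon are ignored.
    """
    distinct_genes = set(cluster_genes)
    counts = Counter(gene_to_operon[g] for g in distinct_genes if g in gene_to_operon)
    return any(n >= 2 for n in counts.values())

# Hypothetical operon labels, for illustration only.
example_operons = {"b0348": "operon_A", "b0349": "operon_A", "b3752": "operon_B"}

print(shares_operon(["b0348", "b0349"], example_operons))   # two genes, one operon
print(shares_operon(["b0348", "b3752"], example_operons))   # genes in different operons
```

Counting distinct genes matters because a single gene can catalyze more than one reaction of a cluster, and such a cluster should not be scored as coregulated on that basis alone.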



SIZE TWO CLUSTERS 



S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Active | BUTCT | ( b2221 and b2222 ) | Alternate Carbon Metabolism | accoa + but --> ac + btcoa
1 | Active | FAO4 | ( b3846 and b0221 ) | Alternate Carbon Metabolism | btcoa + fad + h2o + nad --> aacoa + fadh2 + h + nadh
2 | Active | GALCTND | b3692 | Alternate Carbon Metabolism | galctn-D --> 2dh3dgal + h2o
2 | Active | DDGALK | b3693 | Alternate Carbon Metabolism | 2dh3dgal + atp --> 2dh3dgal6p + adp + h
3 | Inactive | FRUK | b2168 | Alternate Carbon Metabolism | atp + f1p --> adp + fdp + h
3 | Inactive | FRUpts | ( b2167 and b2169 and b2415 and b2416 ) | Transport, Extracellular | fru[e] + pep --> f1p + pyr
4 | Active | DHPPD | b2541 | Alternate Carbon Metabolism | cechddd + nad --> dhpppn + h + nadh
4 | Active | PPPNDO | ( b2538 and b2539 and b2540 and b2542 ) | Alternate Carbon Metabolism | h + nadh + o2 + pppn --> cechddd + nad
5 | Inactive | DHCIND | b2541 | Alternate Carbon Metabolism | cenchddd + nad --> dhcinnm + h + nadh
5 | Inactive | CINNDO | ( b2538 and b2539 and b2540 and b2542 ) | Alternate Carbon Metabolism | cinnm + h + nadh + o2 --> cenchddd + nad
6 | Active | GALS3 | b4119 | Alternate Carbon Metabolism | h2o + melib --> gal + glc-D
6 | Active | MELIBt2 | | |
7 | Active | DHCINDO | b0348 | Alternate Carbon Metabolism | dhcinnm + o2 --> hkntd
7 | Active | HKNTDH | b0349 | Alternate Carbon Metabolism | h2o + hkntd --> fum + (2) h + op4en
8 | Active | HPPPNDO | b0348 | Alternate Carbon Metabolism | dhpppn + o2 --> hkndd
8 | Active | HKNDDH | b0349 | Alternate Carbon Metabolism | h2o + hkndd --> (2) h + op4en + succ
9 | Active | OP4ENH | b0350 | Alternate Carbon Metabolism | h2o + op4en --> 4h2opntn
9 | Active | HOPNTAL | b0352 | Alternate Carbon Metabolism | 4h2opntn --> acald + pyr
10 | Inactive | TRE6PP | b1897 | Alternate Carbon Metabolism | h2o + tre6p --> pi + tre
10 | Inactive | TREH | b3519 | Alternate Carbon Metabolism | h2o + tre --> (2) glc-D
11 | Active | RBK | b3752 | Alternate Carbon Metabolism | atp + rib-D --> adp + h + r5p
11 | Active | RIBabc | ( ( b4231 and b4227 and b4228 and b4229 and b4230 ) or ( b3749 and b3751 and b3750 and b3748 ) ) | Transport, Extracellular | atp + h2o + rib-D[e] --> adp + h + pi + rib-D
12 | Inactive | KG6PDC | ( b3581 or b4196 ) | Alternate Carbon Metabolism | 3dgulnp + h --> co2 + xu5p-L
12 | Inactive | X5PL3E | b4197 | Alternate Carbon Metabolism | xu5p-L --> ru5p-L
13 | Inactive | TAUDO | b0368 | Alternate Carbon Metabolism | akg + o2 + taur --> aacald + co2 + h + so3 + succ
13 | Inactive | TAURabc | ( b0365 and b0366 and b0367 ) | Transport, Extracellular | atp + h2o + taur[e] --> adp + h + pi + taur



14 


Antix/o 
MUll vc 


ficcn 

xJOOLJ 


UU^HO 


Arninino anH Prnlino Motahnlicm 

r\\ UN III It; dl IU i IUIII IC IVICldUUIIol 1 1 


nli i^n _t_ h _i_ naHnh nli i^ca _i_ narln _l ni 

yiuop + M + I iduui i — > yiuood + i iduu + ui 


i *+ 


Al""*tix/0 

MUll Vc 


\JLU Jl\ 




A rn i n i n o anH Prrilino IVyipf'ahnlicm 
rxiyilllllfc? dl IU i IUIII Ic IVIcldUUIIol 1 1 


diu + yiu i_ — > aujj + yiuou 


I o 


A r*t 1 \ id 


rtrioi-'o 


( hPQ^ft nr h41 1 7 ^ 
^ U^i/OO Ul Lrr I I / J 


/AiyilllllC dl IU riUllllc IVICldUUIIol 1 1 


di y l + 1 1 — > dy 1 1 1 + ou£ 


15 


Active 


AGMT 


b2937 


Arninino qnrl Prnlinp l\/lptahnlicm 
r\i y ii in its di iu i i uin ic iviciduuiioi 1 1 


can m -i- h9n ntrr* -u i iroa 

dyi 1 1 + i i^u — [ju \j + uicd 


1R 
I o 


II IdCUVC 


FTHAAI 

C 1 n/A/AL 


f h9<l<in anH VOAAA \ 
\ U^-'t'+U dl IU U^-^t^t I J 


Poll Fnx/alr^no Rincx/nthocic 

v_> t?i I ci ivciupc Diubyi iu icolo 


otha or'olH _i_ nhA 

CU Id — > dUdlU + 1 II It 


16 


Inantix/p 

II IdUUVC 




b2239 


PpII Fn\/plnnp RinQ\/nthpQiQ 

\_ftSII 1 1 IVvlvUC LJIXJOyl 1 LI ICOlO 


n^np _i_ h^n pthp _i_ nK/n^n -i. h 

yo|Jc + i i^u — cu id + yiyuofj + 1 1 


17 


A ft 1 \ id 

MUll vc 


UnUnUn 


UUUi/O 


Poll Pn\ /a\r*\t~\a Rincx/nthocic 
Ucll LI IVclULJc DlUoyl III ItJolo 


hQn _i_ i liana — _■_ i iQhna 

1 l^U + (JOdyd — j> du + UOI iyd 


17 


Ar*ti\/0 


1 l?9fiAAT 

UtOUnn 1 


hf)1 70, 

UU 1 / C7 


Poll Fn\/olnno Rincx/nthocic 
Weil c\ ivtsiU|Jc DlUoyl ill icoio 


QhmrcAPP -u i ilhnn — --, APP -i- h -u i i91nn 
oi ii i ii orAO i + uoiiyd — aaoi +ii + u^oyd 


I o 


incantix/o 


LU 1 AGO 


h9^7ft 

VJC.O 1 


Poll Pn\ /a\r>K>cs Riocx/nthocic 
Ucll dl IVclUUc DlUoyl III Icolo 


hHoAPP -u kHn9liniH4 — >. APP j. kHn9liniH4n 
1 lUcnUr + rvUU£IIUIU t t — > nvr + PkUU^IIUIUH-U 


1ft 
I o 


Inantix/O 
II IdL-UVC 


EDTXS4 


U 1 OJJ 


Poll Fn\/olnno Rincx/nthocic 
otsii l.i ivt?iu|Jts Diuoyi in icoio 


kHn9liniH4n -u mvrcAPP ~>* APP -u linn nnlH 

r\UU£-ll|JIU+p + lliyiOr\x-/i — r\\ji + ll[Jd UUIU 




Inantix/o 
II id^livc 


ni ptt 

uin i 


^ UO/ OJ3 Ul U^UOC? ) 


Poll Fnx/olnno Rincx/nthocic 

UCll L.I IVclUUti DIUoyi III IColO 


Httn _i_ n1n _i_ h HtHnnli i _l nni 

uiiu + y iu + ii — »> uiuuyiu + upi 


1Q 


Inantix/o 

II IdUUVC 


i uruun 


( M78R nr h?041 \ 

\ UO/ OO Ul Uty't 1 } 


PpII Fn\/plnnp Rincx/nthpcic 
ucii c\ ivciujjc Diuoyi in icoio 


HtHnnli i — ^> HtHn4HRHn -u hPn 
uiu|jyiu — uiujJM-uuuy + ii^u 


c\) 


II IdCUVC 


i urunc 


horns 

U^xJOO 


Poll E— n\ /a\r\t~\a Ri/~(cx/nthocic 
V_> t?l 1 CI IVtJIUUc DlUoyl III lc?olo 


r\tr\riAr\(^rin . HtHrvlHRHm 

ULUUHUDUy — ■> UlUUHUDUIIl 


20 


lnanti\/P 

II IdUUVC 


TDPDRR 
i ur unn 


b2040 


Poll Fn\/plnnp Rincx/nthocic 
veil ci ivdujjc; Diuoyi in icoio 


HtHn4HfiHm _u h -u naHnh HtHnrmn -u narln 

UlUJJ'-rUUUI 1 1 + I l + I ldU|JI l — UIU|JI 1 1 II I + I ldU[J 




Inaz-'tix/o 
II IdCll vc 


ri\ 1 |pvc 


h9fi^ft 


P^footnr onH P rocthotir" Rir^cx/nthocic 
OUIdClUI dl IU r lUoll IclIC OIUUU DIUoyi III IfcJolo 


of i-i i /"»x/c_l _i_ nli i_l — aHn _i_ nli ir*x/c _i_ h _i_ r~ii 

diu + oyo l + yiu i_ > duu + yiuuyo + n + ui 


21 


Inantix/o 
II id^livc 


xj i no 


b2947 


Pnfnntnr anH Prncthotin r^rni in Rincx/nthocic 
OUIdUlUI dl IU r I Uoll IcLIU oiuuu Diuoyi III IColO 


atn _i_ nli ir*x/c _l nix/ aHn _i_ nthrH ± h ± ni 

dip + yiuuyo + yiy — j> dup + yunu + ii + pi 


00 


r\OU vc 


Ol II NR 


UU/ UxJ 


(~^f^far , fr\r a niH Prr*>cthciti*^ ^^rrxi Rirxcx/nthocic 
OUIdUlUI dl IU r lUoll IclIU OIUUU DIUoyi 111 IfcJolo 


Hhar^ _i_ iacr^ — h _i_ (0\ hOn _i_ i~ii _i_ ni iln 

Ul Idp + Idop — J> II + \c.) 1 l^U + pi + U,UII 1 


22. 


Antix/o 


MMDPR 


UU 1 \}<J 


Pnfnntnr anH Prncthotir* f^rni in Rincx/nthocic 

OUIdUlUI dl IU r lUoll iciiu OIUUU DIUoyi III lUolo 


(0\ h a. nrnn j- ni iln ^ rn9 _u nirrnt j- nni 

\£. 1 1 1 t Ul pp + L|U II 1 — \j\JC- + 1 IIUI 1 1 L + ppi 


01 


A fi i \ /a 
MUll Vc 


MMAT 

ININM 1 


UUOOI? 


Prxfa^tnr anH Pmcthoti/ -1 /^rni ir^ Rincx/nthocic 
OUIdUlUI dl IU i lUoll ICIIU xJIUUU DIUoyi III Icolo 


atr"i _i_ h _i_ nirrnt Hnarl _i_ nni 

dip + 1 1 + 1 IIUI 1 11 — > Ul IdU + ppi 


01 


A 1 \ ICS 

rtcuve 


IMnUO 1 


hi 7 AO 


\_/Oiacior anu rrosuietic oroup Diosynuiesis 


aip + unau + n\\ £ * --> amp + n + nau + ppi 


OA 


lnar*tix/o 
II idOUVC 


PMPK 

i IVIn r\ 


U^ I UO 


Pnfcmtnr anH Prncthptin r^rni in Rincx/nthocic 
OUIdUlUI dl IU r I Uoll ICUU vjiuup DIUoyi III Icolo 


4amnrn _i_ atn 9mahmn _l aHn 

*+dl 1 ipi 1 1 + dip — > t-\ 1 Idl II 1 ip + dup 


24 


Inantix/o 

II ldL<UVC 


JJyjppp 


UO*3\30 


Pnfnntnr anH Prncthotin f^rni in Rincx/nthocic 
ouiduiui di iu r luoii itsiiu oiuuu Diuoyi in luoio 


9ma hmn _u 4mnpt7 _u h nni -u thmrnn 

cL\ I idl ll l ip + tl l ipclz. + 1 1 — ppi + U II I n I ip 


OR 

£-0 


A ot 1 \ ftp 

Mcuve 


uL in 1 1 


hQC1 7 
UOD I / 


oiyunie di iu oyi ii ic ivieiduuiibiii 


cldODUl + COd > dOOOd + giy 


25 


Active 


THRD 


b3616 


Pilvninp anH Rprinp ^/lpt3hnlicm 

vJiy ic cii i\j uci ii ic iviciauxjnoi 1 1 


naH -u thr-l Panhi it -u h -u naHh 

1 luU T U II l_ LuUUUl 1 1 I 1 I luUI 1 


26 | Active | GLCS1 | b3429 | Glycolysis/Gluconeogenesis | adpglc --> adp + glycogen + h
26 | Active | GLGC | b3430 | Glycolysis/Gluconeogenesis | atp + g1p + h --> adpglc + ppi
27 | Active | FAO3 | ( b3845 and b3846 and b1805 and b0221 ) | Membrane Lipid Metabolism | atp + (9) coa + (8) fad + (8) h2o + (8) nad + ocdca --> (9) accoa + amp + (8) fadh2 + (8) h + (8) nadh + ppi
27 | Active | OCDCAt2 | b2344 | Transport, Extracellular | h[e] + ocdca[e] --> h + ocdca
28 | Inactive | CYNTAH | b0340 | Nitrogen | cynt + (3) h + hco3 --> (2) co2 + nh4
28 | Inactive | CYNTt2 | b0341 | Putative Transporters | cynt[e] + h[e] --> cynt + h
29 | Active | PRFGS | b2557 | Purine and Pyrimidine Biosynthesis | atp + fgam + gln-L + h2o --> adp + fpram + glu-L + h + pi
29 | Active | PRAIS | b2499 | Purine and Pyrimidine Biosynthesis | atp + fpram --> adp + air + (2) h + pi
30 | Inactive | LYSDC | ( b4131 or b0186 ) | Threonine and Lysine Metabolism | h + lys-L --> 15dap + co2
30 | Inactive | CADVt | b4132 | Transport, Extracellular | 15dap + h[e] + lys-L[e] --> 15dap[e] + h + lys-L
31 | Active | HSK | b0003 | Threonine and Lysine Metabolism | atp + hom-L --> adp + h + phom
31 | Active | THRS | b0004 | Threonine and Lysine Metabolism | h2o + phom --> pi + thr-L
32 | Inactive | TSULabc | ( ( b2425 and b2424 and b2423 and b2422 ) or ( b2424 and b2423 and b2422 and b3917 ) ) | Transport, Extracellular | atp + h2o + tsul[e] --> adp + h + pi + tsul
32 | Inactive | CYANST | b3425 | Unassigned | cyan + tsul --> h + so3 + tcynt
33 | Active | DHQS | b3389 | Tyrosine, Tryptophan, and Phenylalanine Metabolism | 2dda7p --> 3dhq + pi
33 | Active | DDPA | ( b2601 or b1704 or b0754 ) | Tyrosine, Tryptophan, and Phenylalanine Metabolism | e4p + h2o + pep --> 2dda7p + pi
34 | Active | ACGS | b2818 | Arginine and Proline Metabolism | accoa + glu-L --> acglu + coa + h
34 | Active | ACGK | b3959 | Arginine and Proline Metabolism | acglu + atp --> acg5p + adp
35 | Active | UAGDP | b3730 | Cell Envelope Biosynthesis | acgam1p + h + utp --> ppi + uacgam
35 | Active | G1PACT | b3730 | Cell Envelope Biosynthesis | accoa + gam1p --> acgam1p + coa + h
36 | Inactive | GSPMDA | b2988 | Arginine and Proline Metabolism | gtspmd + h2o --> gthrd + spmd
36 | Inactive | GSPMDS | b2988 | Arginine and Proline Metabolism | atp + gthrd + spmd --> adp + gtspmd + h + pi
37 | Inactive | HYPOE | NA | Cofactor and Prosthetic Group Biosynthesis | h2o + pyam5p --> pi + pydam
37 | Inactive | PYDAMK | b2418 | Cofactor and Prosthetic Group Biosynthesis | atp + pydam --> adp + h + pyam5p


38 | Inactive | PYDXPP | NA | Cofactor and Prosthetic Group Biosynthesis | h2o + pydx5p --> pi + pydx
38 | Inactive | PYDXK | ( b1636 or b2418 ) | Cofactor and Prosthetic Group Biosynthesis | atp + pydx --> adp + h + pydx5p
39 | Inactive | PDXPP | NA | Cofactor and Prosthetic Group Biosynthesis | h2o + pdx5p --> pi + pydxn
39 | Inactive | PYDXNK | b2418 | Cofactor and Prosthetic Group Biosynthesis | atp + pydxn --> adp + h + pdx5p
40 | Inactive | DMATT | b0421 | Cofactor and Prosthetic Group Biosynthesis | dmpp + ipdp --> grdp + ppi
40 | Inactive | GRTT | b0421 | Cofactor and Prosthetic Group Biosynthesis |
41 | Active | PTRCTA | NA | Arginine and Proline Metabolism | akg + ptrc --> 4abutn + glu-L
41 | Active | ABUTD | NA | Arginine and Proline Metabolism | 4abutn + h2o + nad --> 4abut + (2) h + nadh


42 | Inactive | AMAOTr | b0774 | Cofactor and Prosthetic Group Biosynthesis | 8aonn + amet --> amob + dann
42 | Inactive | AMAOTr | b0774 | Cofactor and Prosthetic Group Biosynthesis | amob + dann --> 8aonn + amet
43 | Inactive | AOXSr | b0776 | Cofactor and Prosthetic Group Biosynthesis | ala-L + h + pmcoa --> 8aonn + co2 + coa
43 | Inactive | AOXSr | b0776 | Cofactor and Prosthetic Group Biosynthesis | 8aonn + co2 + coa --> ala-L + h + pmcoa
44 | Inactive | CBIAT | b1270 | Cofactor and Prosthetic Group Biosynthesis | atp + cbi + h2o --> adocbi + pi + ppi
44 | Inactive | CBIAT | b1270 | Cofactor and Prosthetic Group Biosynthesis | adocbi + pi + ppi --> atp + cbi + h2o
45 | Inactive | GTHOr | b3500 | Cofactor and Prosthetic Group Biosynthesis | gthox + h + nadph --> (2) gthrd + nadp
45 | Inactive | GTHOr | b3500 | Cofactor and Prosthetic Group Biosynthesis | (2) gthrd + nadp --> gthox + h + nadph
46 | Inactive | ADK4 | b0474 | Nucleotide Salvage Pathways | amp + itp --> adp + idp
46 | Inactive | ADK4 | b0474 | Nucleotide Salvage Pathways | adp + idp --> amp + itp
47 | Inactive | TMDPP | b4382 | Nucleotide Salvage Pathways | pi + thymd --> 2dr1p + thym
47 | Inactive | TMDPP | b4382 | Nucleotide Salvage Pathways | 2dr1p + thym --> pi + thymd
48 | Inactive | CRNBTCT | b0038 | Oxidative phosphorylation | bbtcoa + crn --> crncoa + gbbtn
48 | Inactive | CRNBTCT | b0038 | Oxidative phosphorylation | crncoa + gbbtn --> bbtcoa + crn
49 | Inactive | CRNCBCT | b0038 | Oxidative phosphorylation | crn + ctbtcoa --> crncoa + ctbt
49 | Inactive | CRNCBCT | b0038 | Oxidative phosphorylation | crncoa + ctbt --> crn + ctbtcoa


SIZE THREE CLUSTERS 


S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Active | FFSD | NA | Alternate Carbon Metabolism | h2o + suc6p --> fru + g6p
1 | Active | XYLI2i | b3565 | Alternate Carbon Metabolism | fru --> glc-D
1 | Active | SUCpts | ( b2417 and b2415 and b2416 and b2429 ) | Transport, Extracellular | pep + sucr[e] --> pyr + suc6p
2 | Active | MICITD | ( b1276 or b0118 ) | Alternate Carbon Metabolism | 2mcacn + h2o --> micit
2 | Active | MCITS | b0333 | Alternate Carbon Metabolism | h2o + oaa + ppcoa --> 2mcit + coa + h
2 | Active | MCITD | b0334 | Alternate Carbon Metabolism | 2mcit --> 2mcacn + h2o
3 | Inactive | ALDD19X | b1385 | Alternate Carbon Metabolism | h2o + nad + pacald --> (2) h + nadh + pac
3 | Inactive | PACCOAL | b1398 | Alternate Carbon Metabolism | atp + coa + pac --> amp + phaccoa + ppi
3 | Inactive | PEAMNO | b1386 | Alternate Carbon Metabolism | h2o + o2 + peamn --> h2o2 + nh4 + pacald
4 | Active | ACNML | b3225 | Alternate Carbon Metabolism | acnam --> acmana + pyr
4 | Active | AMANK | b3222 | Putative | acmana + atp --> acmanap + adp + h
4 | Active | ACNAMt2 | b3224 | Transport, Extracellular | acnam[e] + h[e] --> acnam + h
5 | Active | MTRK | NA | Arginine and Proline Metabolism | 5mtr + atp --> 5mdr1p + adp + h
5 | Active | MTAN | b0159 | Arginine and Proline Metabolism | 5mta + h2o --> 5mtr + ade
5 | Active | SPMS | b0121 | Arginine and Proline Metabolism | ametam + ptrc --> 5mta + h + spmd
6 | Active | KDOPP | NA | Cell Envelope Biosynthesis | h2o + kdo8p --> kdo + pi
6 | Active | KDOPS | b1215 | Cell Envelope Biosynthesis | ara5p + h2o + pep --> kdo8p + pi
6 | Active | KDOCT2 | b0918 | Cell Envelope Biosynthesis | ctp + kdo --> ckdo + ppi
7 | Inactive | DHPTDC | NA | Methionine Metabolism | dhptd --> h2o + hmfurn
7 | Inactive | RHCCE | b2687 | Methionine Metabolism | rhcys --> dhptd + hcys-L



7 


II IdOllvC 


a u PYKNIC: 


o 
O 


Active 


UCCT 

noo I 


o 
O 


Active 


bnoLI 




Active 


UYo 1 L 


Q 

y 


Inactive 


1 /"*Tl-ll 
Lb 1 ML 


y 


Inactive 




Q 

y 


II IdCUVfc; 


IVIOOM 




Active 


UoLYOH 


in 


A of I \ ICS 


Al 1 TM 

MLL 1 IN 


m 

IU 


Active 


Al 1 TAU 
MLL 1 An 


1 1 


Active 


DHDPS 


11 


Active 


DHDPRy 


11 


Active 


THDPS 


12 


Active 


ACHBS 



12 


Active 


12 


Active 


13 


Active 



KARA2i 
DHAD2 
ACLS 



13 


Active 


KARA1 i 


13 


Active 


DHAD1 


14 


Active 


LEUTAi 


14 


Active 


IPMD 


14 


Active 


OMCDC 


15 


S. No. | Category | Abbreviation | GPR Association | Description | Reaction
 |  |  |  | Methionine Metabolism | ahcys + h2o --> ade + rhcys
 |  |  |  | Methionine Metabolism | hom-L + succoa --> coa + suchms
 |  |  |  | Methionine Metabolism | cys-L + suchms --> cyst-L + h + succ
 |  |  |  | Methionine Metabolism | cyst-L + h2o --> hcys-L + nh4 + pyr
 |  |  |  | Methylglyoxal Metabolism | gthrd + mthgxl --> lgt-S
 |  |  | b0212 | Methylglyoxal Metabolism | h2o + lgt-S --> gthrd + h + lac-D
 |  |  | b0963 | Methylglyoxal Metabolism | dhap --> mthgxl + pi
 |  |  | b0505 | Nitrogen | (2) h + h2o + urdglyc --> co2 + glx + (2) nh4
 |  |  | b0512 | Nitrogen | alltn + h2o --> alltt + h
 |  |  | b0516 | Nitrogen | alltt + h2o --> urdglyc + urea
 |  |  | b2478 | Threonine and Lysine Metabolism | aspsa + pyr --> 23dhdp + h + (2) h2o
 |  |  | b0031 | Threonine and Lysine Metabolism | 23dhdp + h + nadph --> nadp + thdp
 |  |  | b0166 | Threonine and Lysine Metabolism | h2o + succoa + thdp --> coa + sl2a6o
 |  |  | ( ( b3670 and b3671 ) or ( b3769 and b3767 ) or ( b0077 and b0078 ) ) | Valine, leucine, and isoleucine metabolism | 2obut + h + pyr --> 2ahbut + co2
 |  |  | b3774 | Valine, leucine, and isoleucine metabolism | 2ahbut + h + nadph --> 23dhmp + nadp
 |  |  | b3771 | Valine, leucine, and isoleucine metabolism | 23dhmp --> 3mop + h2o
 |  |  | ( ( b3670 and b3671 ) or ( b3769 and b3767 ) or ( b0077 and b0078 ) ) | Valine, leucine, and isoleucine metabolism | h + (2) pyr --> alac-S + co2
 |  |  | b3774 | Valine, leucine, and isoleucine metabolism | alac-S + h + nadph --> 23dhmb + nadp
 |  |  | b3771 | Valine, leucine, and isoleucine metabolism | 23dhmb --> 3mob + h2o
 |  |  | ( b4054 or b3770 ) | Valine, leucine, and isoleucine metabolism | 4mop + glu-L --> akg + leu-L
 |  |  | b0073 | Valine, leucine, and isoleucine metabolism | 3c2hmp + nad --> 3c4mop + h + nadh
 |  |  | b0073 | Valine, leucine, and isoleucine metabolism | 3c4mop + h --> 4mop + co2
 | Active | PGCD | b2913 | Glycine and Serine Metabolism | 3pg + nad --> 3php + h + nadh
15 | Active | PSP_L | b4388 | Glycine and Serine Metabolism | h2o + pser-L --> pi + ser-L
15 | Active | PSERT | b0907 | Glycine and Serine Metabolism | 3php + glu-L --> akg + pser-L
16 | Inactive | SHCHD2 | b3368 | Cofactor and Prosthetic Group Biosynthesis | nad + shcl --> h + nadh + srch
16 | Inactive | SHCHF | b3368 | Cofactor and Prosthetic Group Biosynthesis | fe2 + srch --> (3) h + sheme
16 | Inactive | UPP3MT | ( b3368 or b3803 ) | Cofactor and Prosthetic Group Biosynthesis | (2) amet + uppg3 --> (2) ahcys + h

SIZE FOUR CLUSTERS

S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Inactive | GDMANE | b2052 | Cell Envelope Biosynthesis | gdpddman --> gdpofuc
1 | Inactive | GOFUCR | b2052 | Cell Envelope Biosynthesis | gdpofuc + h + nadph --> gdpfuc + nadp
1 | Inactive | GMAND | b2053 | Cell Envelope Biosynthesis | gdpmann --> gdpddman + h2o
1 | Inactive | MAN1PT2 | b2049 | Cell Envelope Biosynthesis | gdp + h + man1p --> gdpmann + pi
2 | Active | PMDPHT | NA | Cofactor and Prosthetic Group Biosynthesis | 5aprbu + h2o --> 4r5au + pi
2 | Active | GTPCII2 | b1277 | Cofactor and Prosthetic Group Biosynthesis | gtp + (3) h2o --> 25drapp + for + (2) h
2 | Active | APRAUR | b0414 | Cofactor and Prosthetic Group Biosynthesis | 5apru + h + nadph --> 5aprbu + nadp
2 | Active | DHPPDA2 | b0414 | Cofactor and Prosthetic Group Biosynthesis | 25drapp + h + h2o --> 5apru + nh4
3 | Active | MOHMT | b0134 | Cofactor and Prosthetic Group Biosynthesis | 3mob + h2o + mlthf --> 2dhp + thf
3 | Active | PANTS | b0133 | Cofactor and Prosthetic Group Biosynthesis | ala-B + atp + pant-R --> amp + h + pnto-R +
3 | Active | ASP1DC | b0131 | Cofactor and Prosthetic Group Biosynthesis | asp-L + h --> ala-B + co2
3 | Active | DPR | ( b0425 or b3774 ) | Cofactor and Prosthetic Group Biosynthesis | 2dhp + h + nadph --> nadp + pant-R
4 | Active | ADSK | b2750 | Cysteine Metabolism | aps + atp --> adp + h + paps
4 | Active | SADT2 | ( b2751 and b2752 ) | Cysteine Metabolism | atp + gtp + h2o + so4 --> aps + gdp + pi + ppi
4 | Active | PAPSR | b2762 | Cysteine Metabolism | paps + trdrd --> (2) h + pap + so3 + trdox
4 | Active | SULabc | ( ( b2425 and b2422 and b2423 and b2424 ) or ( b2424 and b2422 and b2423 and b2413 and b3917 ) ) | Transport, Extracellular | atp + h2o + so4[e] --> adp + h + pi + so4
5 | Active | PRAIi | b1262 | Tyrosine, Tryptophan, and Phenylalanine Metabolism | pran --> 2cpr5p
5 | Active | IGPS | b1262 | Tyrosine, Tryptophan, and Phenylalanine Metabolism | 2cpr5p + h --> 3ig3p + co2 + h2o
5 | Active | ANS | ( b1263 and b1264 ) | Tyrosine, Tryptophan, and Phenylalanine Metabolism | chor + gln-L --> anth + glu-L + h + pyr
5 | Active | ANPRT | b1263 | Tyrosine, Tryptophan, and Phenylalanine Metabolism | anth + prpp --> ppi + pran



SIZE FIVE CLUSTERS 



S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Active | AST | b1747 | Arginine and Proline Metabolism | arg-L + succoa --> coa + h + sucarg
1 | Active | SOTA | b1745 | Arginine and Proline Metabolism | akg + sucorn --> glu-L + sucgsa
1 | Active | SADH | b1748 | Arginine and Proline Metabolism | (2) h + (2) h2o + sucarg --> co2 + (2) nh4 + sucorn
1 | Active | SGSAD | b1746 | Arginine and Proline Metabolism | h2o + nad + sucgsa --> (2) h + nadh + sucglu
1 | Active | SGDS | b1744 | Arginine and Proline Metabolism | h2o + sucglu --> glu-L + succ
2 | Active | MOAT | b3633 | Cell Envelope Biosynthesis | ckdo + lipidA --> cmp + h + kdolipid4
2 | Active | MOAT2 | b3633 | Cell Envelope Biosynthesis | ckdo + kdolipid4 --> cmp + h + kdo2lipid4
2 | Active | LPADSS | b0182 | Cell Envelope Biosynthesis | lipidX + u23ga --> h + lipidAds + udp
2 | Active | TDSK | b0915 | Cell Envelope Biosynthesis | atp + lipidAds --> adp + h + lipidA
2 | Active | USHD | b0480 | Cell Envelope Biosynthesis | h2o + u23ga --> (2) h + lipidX + ump
3 | Active | PPCDC | NA | Cofactor and Prosthetic Group Biosynthesis | 4ppcys + h --> co2 + pan4p
3 | Active | DPCOAK | NA | Cofactor and Prosthetic Group Biosynthesis | atp + dpcoa --> adp + coa + h
3 | Active | PPNCL2 | NA | Cofactor and Prosthetic Group Biosynthesis | 4ppan + ctp + cys-L --> 4ppcys + cmp + h + ppi
3 | Active | PNTK | b3974 | Cofactor and Prosthetic Group Biosynthesis | atp + pnto-R --> 4ppan + adp + h
3 | Active | PTPATi | b3634 | Cofactor and Prosthetic Group Biosynthesis | atp + h + pan4p --> dpcoa + ppi
4 | Inactive | ADOCBLS | b1992 | Cofactor and Prosthetic Group Biosynthesis | agdpcbi + rdmbzi --> adocbl + gmp + h
4 | Inactive | NNDMBRT | b1991 | Cofactor and Prosthetic Group Biosynthesis | dmbzid + nicrnt --> 5prdmbz + h + nac
4 | Inactive | ADOCBIK | b1993 | Cofactor and Prosthetic Group Biosynthesis | adocbi + atp --> adocbip + adp + h
4 | Inactive | ACBIPGT | b1993 | Cofactor and Prosthetic Group Biosynthesis | adocbip + gtp + h --> agdpcbi + ppi
4 | Inactive | RZ5PP | b0638 | Cofactor and Prosthetic Group Biosynthesis | 5prdmbz + h2o --> pi + rdmbzi
5 | Inactive | HEMEOS | b0428 | Cofactor and Prosthetic Group Biosynthesis | frdp + h2o + pheme --> hemeO + ppi
5 | Inactive | UPPDC1 | b3997 | Cofactor and Prosthetic Group Biosynthesis | (4) h + uppg3 --> (4) co2 + cpppg3
5 | Inactive | CPPPGO | b2436 | Cofactor and Prosthetic Group Biosynthesis | cpppg3 + (2) h + o2 --> (2) co2 + (2) h2o + pppg9
5 | Inactive | PPPGO | b3850 | Cofactor and Prosthetic Group Biosynthesis | (1.5) o2 + pppg9 --> (3) h2o + ppp9
5 | Inactive | FCLT | b0475 | Cofactor and Prosthetic Group Biosynthesis | fe2 + ppp9 --> (2) h + pheme
6 | Inactive | DXPRIi | b0173 | Cofactor and Prosthetic Group Biosynthesis | dxyl5p + h + nadph --> 2me4p + nadp
6 | Inactive | MECDPDH | b2515 | Cofactor and Prosthetic Group Biosynthesis | 2mecdp + h --> h2mb4p + h2o
6 | Inactive | MEPCT | b2747 | Cofactor and Prosthetic Group Biosynthesis | 2me4p + ctp + h --> 4c2me + ppi
6 | Inactive | CDPMEK | b1208 | Cofactor and Prosthetic Group Biosynthesis | 4c2me + atp --> 2p4c2me + adp + h
6 | Inactive | MECDPS | b2746 | Cofactor and Prosthetic Group Biosynthesis | 2p4c2me --> 2mecdp + cmp
7 | Active | DB4PS | b3041 | Cofactor and Prosthetic Group Biosynthesis | ru5p-D --> db4p + for + h
7 | Active | RBFSa | b1662 | Cofactor and Prosthetic Group Biosynthesis | 4r5au + db4p --> dmlz + (2) h2o + pi
7 | Active | RBFK | b0025 | Cofactor and Prosthetic Group Biosynthesis | atp + ribflv --> adp + fmn + h
7 | Active | FMNAT | b0025 | Cofactor and Prosthetic Group Biosynthesis | atp + fmn + h --> fad + ppi
7 | Active | RBFSb | b0415 | Cofactor and Prosthetic Group Biosynthesis | (2) dmlz --> 4r5au + ribflv




SIZE SIX CLUSTERS 



S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Inactive | GLUTRS | b2400 |  | atp + glu-L + trnaglu --> amp + glutrna + ppi
1 | Inactive | GLUTRR | b1210 |  | glutrna + h + nadph --> glu1sa + nadp + trnaglu
1 | Inactive | PPBNGS | b0369 |  | (2) 5aop --> h + (2) h2o + ppbng
1 | Inactive | HMBS | b3805 |  | h2o + (4) ppbng --> hmbil + (4) nh4
1 | Inactive | UPP3S | b3804 |  | hmbil --> h2o + uppg3
1 | Inactive | G1SATi | b0154 |  | glu1sa --> 5aop
2 | Inactive | DHNAOT | b3930 |  | dhna + octdp --> 2dmmq8 + co2 + h + ppi
2 | Inactive | NPHS | b2262 |  | sbzcoa --> coa + dhna
2 | Inactive | SUCBZS | b2261 |  | 2shchc --> h2o + sucbz
2 | Inactive | OXGDC2 | b2264 |  | akg + h + thmpp --> co2 + ssaltpp
2 | Inactive | SHCHCS2 | b2264 |  | ichor + ssaltpp --> 2shchc + pyr + thmpp
2 | Inactive | SUCBZL | b2260 |  | atp + coa + sucbz --> amp + ppi + sbzcoa

S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Inactive | ECAP_EC | NA |  | unagamuf --> eca_EC + h + udcpdp
1 | Inactive | ACGAMT | b3784 |  | uacgam + udcpp --> ump + unaga
1 | Inactive | UAG2Ei | b3786 |  | uacgam --> uacmam
1 | Inactive | UACMAMO | b3787 |  | h2o + (2) nad + uacmam --> (3) h + (2) nadh + uacmamu
1 | Inactive | TDPADGAT | b3790 |  | accoa + dtdp4addg --> coa + dtdp4aaddg + h
1 | Inactive | TDPAGTA | b3791 |  | dtdp4d6dg + glu-L --> akg + dtdp4addg
1 | Inactive | AADDGT | b3793 |  | dtdp4aaddg + unagamu --> dtdp + h + unagamuf
1 | Inactive | ACMAMUT | b3794 |  | uacmamu + unaga --> h + udp + unagamu
2 | Active | S7PI | b0222 |  | s7p --> gmhep7p
2 | Active | GMHEPPA | b0200 |  | gmhep17bp + h2o --> gmhep1p + pi
2 | Active | EDTXS1 | b1054 |  | ddcaACP + kdo2lipid4 --> ACP + kdo2lipid4L
2 | Active | EDTXS2 | b1855 |  | kdo2lipid4L + myrsACP --> ACP + lipa
2 | Active | AGMHE | b3619 |  | adphep-D,D --> adphep-L,D
2 | Active | GMHEPAT | b3052 |  | atp + gmhep1p + h --> adphep-D,D + ppi
2 | Active | GMHEPK | b3052 |  | atp + gmhep7p --> adp + gmhep17bp + h
2 | Active | LPSSYN_EC | ( b3620 ... b3627 a |  | (3) adphep-L,D + (2) cdpea + (3) ckdo + lipa + (2) udpg --> adp + (2) cdp + (3) cmp + (10) h + lps_EC + (2) udp

S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Active | PPTGS | NA |  | uaagmda --> h + peptido_EC + udcpdp
1 | Active | PAPPT3 | b0087 |  | udcpp + ugmda --> uagmda + ump
1 | Active | UAGCVT | b3189 |  | pep + uacgam --> pi + uaccg
1 | Active | UAPGR | b3972 |  | h + nadph + uaccg --> nadp + uamr
1 | Active | UAMAS | b0091 |  | ala-L + atp + uamr --> adp + h + pi + uama
1 | Active | UAMAGS | b0088 |  | atp + glu-D + uama --> adp + h + pi + uamag
1 | Active | UAAGDS | b0085 |  | 26dap-M + atp + uamag --> adp + h + pi + ugmd
1 | Active | UGMDDS | b0086 | Cell Envelope Biosynthesis | alaala + atp + ugmd --> adp + h + pi + ugmda
1 | Active | UAGPT3 | b0090 | Cell Envelope Biosynthesis | uacgam + uagmda --> h + uaagmda + udp
2 | Inactive | HBZOPT | b4040 | Cofactor and Prosthetic Group Biosynthesis | 4hbz + octdp --> 3ophb + ppi
2 | Inactive | OPHHX | b3835 | Cofactor and Prosthetic Group Biosynthesis | 2oph + (0.5) o2 --> 2ohph
2 | Inactive | CHRPL | b4039 | Cofactor and Prosthetic Group Biosynthesis | chor --> 4hbz + pyr
2 | Inactive |  | ( b3843 or b2311 ) | Cofactor and Prosthetic Group Biosynthesis | 3ophb + h --> 2oph + co2
2 | Inactive | OMBZLM | b3833 | Cofactor and Prosthetic Group Biosynthesis | 2ombzl + amet --> 2ommbl + ahcys + h
2 | Inactive | OMMBLHX | b0662 | Cofactor and Prosthetic Group Biosynthesis | 2ommbl + (0.5) o2 --> 2omhmbl
2 | Inactive | OHPHM | b2232 | Cofactor and Prosthetic Group Biosynthesis | 2ohph + amet --> 2omph + ahcys + h
2 | Inactive | DMQMT | b2232 | Cofactor and Prosthetic Group Biosynthesis | 2omhmbl + amet --> ahcys + h + q8h2
2 | Inactive | OMPHHX | b2907 | Cofactor and Prosthetic Group Biosynthesis | 2omph + (0.5) o2 --> 2ombzl
3 | Active | PRMICIi | b2024 | Histidine Metabolism | prfp --> prlp
3 | Active | IGPDH | b2022 | Histidine Metabolism | eig3p --> h2o + imacp
3 | Active | HISTP | b2022 | Histidine Metabolism | h2o + hisp --> histd + pi
3 | Active | HSTPT | b2021 | Histidine Metabolism | glu-L + imacp --> akg + hisp
3 | Active | HISTD | b2020 | Histidine Metabolism | h2o + histd + (2) nad --> (3) h + his-L + (2) nadh
3 | Active | IG3PS | ( b2023 and b2025 ) | Histidine Metabolism | gln-L + prlp --> aicar + eig3p + glu-L + h
3 | Active | ATPPRT | b2019 | Histidine Metabolism | atp + prpp --> ppi + prbatp
3 | Active | PRATPP | b2026 | Histidine Metabolism | h2o + prbatp --> h + ppi + prbamp
3 | Active | PRAMPC | b2026 | Histidine Metabolism | h2o + prbamp --> prfp

SIZE TEN CLUSTERS

S. No. | Category | Abbreviation | GPR Association | Description | Reaction
1 | Active | DNMPPA | NA | Cofactor and Prosthetic Group Biosynthesis | dhpmp + h2o --> dhnpt + pi
1 | Active | DHNPA2 | b3058 | Cofactor and Prosthetic Group Biosynthesis | dhnpt --> 6hmhpt + gcald
1 | Active | DHFS | b2315 | Cofactor and Prosthetic Group Biosynthesis | atp + dhpt + glu-L --> adp + dhf + pi
1 | Active | GTPCI | b2153 | Cofactor and Prosthetic Group Biosynthesis | gtp + h2o --> ahdt + for
1 | Active | HPPK2 | b0142 | Cofactor and Prosthetic Group Biosynthesis | 6hmhpt + atp --> 6hmhptpp + amp + h
1 | Active | DHPS2 | b3177 | Cofactor and Prosthetic Group Biosynthesis | 4abz + 6hmhptpp --> dhpt + h + ppi
1 | Active | DNTPPA | ( b1865 or b0099 ) | Cofactor and Prosthetic Group Biosynthesis | ahdt + h2o --> dhpmp + h + ppi
1 | Active | ADCS | ( b3360 and b1812 ) | Cofactor and Prosthetic Group Biosynthesis | chor + gln-L --> 4adcho + glu-L
1 | Active | ADCL | b1096 | Cofactor and Prosthetic Group Biosynthesis | 4adcho --> 4abz + h + pyr
1 | Active | GCALDD | b1415 | Folate Metabolism | gcald + h2o + nad --> glyclt + (2) h + nadh



Supplementary Table S7: The size distribution of correlated reaction sets (or 'co-sets') in the E. 
coli metabolic network as reported by Reed and Palsson (Genome Research 14:1797-1805 (2004)) 
and their explanation in terms of UP-UC clusters. 



Size of co-sets | Number of co-sets | Number of co-sets explained by UP-UC clusters | Breakup of explained co-sets into UP-UC clusters in the reduced network
2 | 39 | 22 | 22x(2)
3 | 12 | 8 | 7x(3) + 1x(2)
4 | 10 | 10 | 7x(4) + 2x(3) + 1x(2+2)
5 | 3 | 2 | 1x(4) + 1x(3+2)
9 | 2 | 2 | 1x(6+2) + 1x(4+2)



The notation used in this table is the same as that in Table 2 of the main text. It is evident from the fourth column that the bulk 
of reactions in the co-sets are constituted by UP-UC clusters. 
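The fourth column of Table S7 can be checked mechanically. The sketch below is only an illustration: it assumes that a term such as `1x(2+2)` means one co-set that contains two UP-UC clusters of size 2 each (the notation itself is defined in Table 2 of the main text, which is not reproduced here), and the function names are hypothetical.

```python
import re

def parse_breakup(text):
    """Parse e.g. '7x(4) + 2x(3) + 1x(2+2)' into [(7, [4]), (2, [3]), (1, [2, 2])].

    Assumed reading: 'n x (a+b)' = n co-sets, each holding UP-UC clusters
    of sizes a and b.
    """
    terms = re.findall(r"(\d+)\s*x\s*\(([^)]*)\)", text)
    return [(int(n), [int(s) for s in sizes.split("+")]) for n, sizes in terms]

def explained_sets(text):
    """Number of co-sets accounted for by the breakup string."""
    return sum(n for n, _ in parse_breakup(text))

def reactions_covered(text):
    """Number of reactions that lie inside UP-UC clusters."""
    return sum(n * sum(sizes) for n, sizes in parse_breakup(text))

# Rows of Table S7: size -> (number of co-sets, number explained, breakup)
TABLE_S7 = {
    2: (39, 22, "22x(2)"),
    3: (12, 8, "7x(3) + 1x(2)"),
    4: (10, 10, "7x(4) + 2x(3) + 1x(2+2)"),
    5: (3, 2, "1x(4) + 1x(3+2)"),
    9: (2, 2, "1x(6+2) + 1x(4+2)"),
}

for size, (total, explained, breakup) in TABLE_S7.items():
    in_explained = size * explained
    print(size, total, explained_sets(breakup),
          f"{reactions_covered(breakup)}/{in_explained} reactions in UP-UC clusters")
```

Under this reading, the number of terms in each breakup equals the third column of the table, and for co-sets of size 4, for instance, 38 of the 40 reactions in explained co-sets fall inside UP-UC clusters.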



We comment here on the similarities and differences between the perfect clusters mentioned in the main 
text and correlated reaction sets described by Reed and Palsson. Both approaches identify groups of 
reactions in the network that respond in a 'similar' or correlated fashion across multiple flux vectors. The 
clusters obtained from them overlap significantly. 58 of the 66 co-sets found by Reed and Palsson either 
exactly match our perfect clusters or differ from them by one or two reactions. 

There are two main differences between the two methods. 

First, Reed and Palsson use 'binarized' flux vectors whose entries only carry information about whether a 
reaction has zero or nonzero flux in a given flux vector. The correlation coefficient we use (given in the 
Methods section of the main text) has, on the other hand, the actual flux values of the reactions. Thus while 
the co-sets of Reed and Palsson would cluster together a pair of reactions that are always off together and 
on together, such a pair would have perfect correlation in our case only if it satisfied the further requirement 
that the fluxes of the two reactions were proportional to each other across all flux vectors (with the same 
proportionality constant). In this sense our definition of perfect clusters is tighter than that of co-sets. Thus, 
for example, reactions O2t, NADH6, CYTBO3, SUCD1i and DKMPPD are part of the same co-set in the 
case of Reed and Palsson, while they are not perfectly clustered according to our definition. 
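The difference between the two criteria can be illustrated with a toy example. The sketch below does not reproduce the correlation coefficient of the main text; it tests the stated requirement (proportional fluxes with a single constant) directly, and the flux values and function names are made up for illustration.

```python
import numpy as np

def same_coset(a, b, tol=1e-9):
    """Binarized criterion: the two reactions are on together and off together."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.array_equal(np.abs(a) > tol, np.abs(b) > tol))

def perfectly_correlated(a, b, tol=1e-9):
    """Proportionality criterion: a = c * b with one constant c in every flux vector."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    if not (np.any(np.abs(a) > tol) and np.any(np.abs(b) > tol)):
        return False  # a reaction that never carries flux is not clustered
    # a and b are parallel exactly when a_k * b_l == a_l * b_k for all k, l
    return bool(np.allclose(np.outer(a, b), np.outer(b, a), atol=tol))

# Fluxes of three hypothetical reactions across four flux vectors
r1 = [0.0, 2.0, 4.0, 0.0]
r2 = [0.0, 1.0, 2.0, 0.0]   # always half of r1
r3 = [0.0, 1.0, 3.0, 0.0]   # on/off pattern of r1, but a varying ratio

print(same_coset(r1, r2), perfectly_correlated(r1, r2))
print(same_coset(r1, r3), perfectly_correlated(r1, r3))
```

Here `r1` and `r2` satisfy both criteria, whereas `r1` and `r3` share an on/off pattern but not a fixed flux ratio, so they would fall in the same co-set without forming a perfect cluster.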

Second, we have only one flux vector for each feasible minimal medium in aerobic conditions, and hence 
only 89 flux vectors for E. coli, over which to determine the correlation coefficient. On the other hand Reed 
and Palsson have 56756 flux vectors across 136 growth conditions (88 aerobic and 48 anaerobic 
conditions). Two reactions that are perfectly correlated across a certain set of flux vectors may not be so if 
the set is enlarged. If a new flux vector is such that one of the pair has zero flux and the other a nonzero 
flux in the flux vector, then the pair would not be part of the same co-set. Thus by considering a larger set 
of flux vectors, Reed and Palsson have obtained a smaller number of co-sets than the number of perfect 
clusters we have found. Furthermore, clusters found by Reed and Palsson are in most cases subsets of our 
perfect clusters. 
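The effect of enlarging the set of flux vectors can be sketched in the same spirit. The example below assumes a small made-up flux matrix (rows are reactions, columns are flux vectors) and a simple greedy grouping by proportionality; it is not the procedure used to generate the tables.

```python
import numpy as np

def proportional(a, b, tol=1e-9):
    """True if a = c * b for a single nonzero constant c."""
    if not (np.any(np.abs(a) > tol) and np.any(np.abs(b) > tol)):
        return False
    return bool(np.allclose(np.outer(a, b), np.outer(b, a), atol=tol))

def perfect_clusters(fluxes, names):
    """Group reactions whose flux rows are mutually proportional.

    Proportionality is transitive for nonzero rows, so comparing each row
    with the first member of every existing group is sufficient.
    """
    groups = []
    for i in range(len(names)):
        for group in groups:
            if proportional(fluxes[i], fluxes[group[0]]):
                group.append(i)
                break
        else:
            groups.append([i])
    return [[names[i] for i in g] for g in groups if len(g) > 1]

names = ["R1", "R2", "R3"]
few = np.array([[1.0, 2.0],
                [2.0, 4.0],
                [3.0, 6.0]])
# One additional flux vector in which R3 is off while R1 and R2 are on
more = np.column_stack([few, [1.0, 2.0, 0.0]])

print(perfect_clusters(few, names))
print(perfect_clusters(more, names))
```

With two flux vectors all three reactions form one cluster; the added flux vector removes R3 from it, so the surviving cluster is a subset of the original one.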



Supplementary Table S8: The size distribution of perfect clusters in the S. cerevisiae 
metabolic network and their explanation in terms of UP-UC clusters. 



Size of perfect clusters | Number of perfect clusters | Number of perfect clusters explained | Breakup of explained clusters into UP-UC clusters in the reduced network
2 | 38 | 17 | 17x(2)
3 | 10 | 5 | 2x(3) + 3x(2)
4 | 8 | 4 | 3x(3) + 1x(2)
5 |  | 1 | 1x(5)
6 |  |  | 
7 |  |  | 
8 |  | 1 | 1x(8)
9 |  | 2 | 1x(7+2) + 1x(4)
10 |  |  | 
12 |  | 1 | 1x(4+2)
117 |  | 1 | (18+9+7+7+6+6+5+4+3+3+3+3+3+3+2+2+2+2+2)



Supplementary Table S9: The size distribution of perfect clusters in the S. aureus 
metabolic network and their explanation in terms of UP-UC clusters. 



Size of perfect clusters | Number of perfect clusters | Number of perfect clusters explained | Breakup of explained clusters into UP-UC clusters in the reduced network
2 | 23 | 8 | 8x(2)
3 | 11 | 6 | 4x(3) + 2x(2)
4 | 7 | 4 | 1x(4) + 2x(3) + 1x(2)
5 | 3 | 3 | 1x(5) + 1x(3) + 1x(2)
30 | 1 | 1 | 1x(28)
193 | 1 | 1 | (9+9+8+7+7+6+6+6+5+5+5+4+4+4+4+4+3+3+3+3+3+2+2+2+2+2+2+2+2+2+2+2+2+2+2+2+2)



